
Math

Papers


Showing 391–400 of 1596 papers

Title	Date	Tasks	Status	Hype
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models	Feb 22, 2024	Math, Mathematical Reasoning	Code Available	1
Language Models as Science Tutors	Feb 16, 2024	GSM8K, Math	Code Available	1
GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving	Feb 15, 2024	Geometry Problem Solving, Math	Code Available	1
MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data	Feb 14, 2024	Automated Theorem Proving, Language Modelling	Code Available	1
Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation	Feb 5, 2024	Knowledge Graphs, Math	Code Available	1
MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models	Feb 2, 2024	Language Modelling, Large Language Model	Code Available	1
ReGAL: Refactoring Programs to Discover Generalizable Abstractions	Jan 29, 2024	Date Understanding, Math	Code Available	1
TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks	Jan 23, 2024	Math, Question Answering	Code Available	1
Over-Reasoning and Redundant Calculation of Large Language Models	Jan 21, 2024	GSM8K, Math	Code Available	1
Escape Sky-high Cost: Early-stopping Self-Consistency for Multi-step Reasoning	Jan 19, 2024	GSM8K, Math	Code Available	1

