SOTAVerified|Agents Browse Leaderboard About

Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 376–400 of 1596 papers

Title	Date	Tasks	Status	Hype	Score
ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges	May 21, 2025	Mathvalid	CodeCode Available	1	5
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models	Aug 30, 2024	Image CaptioningLanguage Modeling	CodeCode Available	1	5
A Relation Spectrum Inheriting Taylor Series: Muscle Synergy and Coupling for Hand	Apr 25, 2020	MathRelation	CodeCode Available	1	5
A Symbolic Character-Aware Model for Solving Geometry Problems	Aug 5, 2023	MathMulti-Label Classification	CodeCode Available	1	5
Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning	Jun 4, 2023	Math	CodeCode Available	1	5
Mathfish: Evaluating Language Model Math Reasoning via Grounding in Educational Curricula	Aug 8, 2024	GSM8KLanguage Modeling	CodeCode Available	1	5
EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees	Mar 11, 2025	ChatbotLanguage Modeling	CodeCode Available	1	5
CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning	Oct 14, 2024	MathMathematical Reasoning	CodeCode Available	1	5
Collective Constitutional AI: Aligning a Language Model with Public Input	Jun 12, 2024	Language ModelingLanguage Modelling	CodeCode Available	1	5
A Categorical Archive of ChatGPT Failures	Feb 6, 2023	Math	CodeCode Available	1	5
MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data	Feb 14, 2024	Automated Theorem ProvingLanguage Modelling	CodeCode Available	1	5
Entropy-Regularized Process Reward Model	Dec 15, 2024	GSM8KMath	CodeCode Available	1	5
Memory-Efficient and Secure DNN Inference on TrustZone-enabled Consumer IoT Devices	Mar 19, 2024	Math	CodeCode Available	1	5
Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities	Feb 17, 2025	Code GenerationHumanEval	CodeCode Available	1	5
MedCaseReasoning: Evaluating and learning diagnostic reasoning from clinical case reports	May 16, 2025	DiagnosticMath	CodeCode Available	1	5
Entropy-Based Adaptive Weighting for Self-Training	Mar 31, 2025	GSM8KMath	CodeCode Available	1	5
Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping	Feb 16, 2025	Code GenerationInstruction Following	CodeCode Available	1	5
Language Models Encode the Value of Numbers Linearly	Jan 8, 2024	Language ModelingLanguage Modelling	CodeCode Available	1	5
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning	Jan 27, 2023	Few-Shot LearningGSM8K	CodeCode Available	1	5
Eliminating Position Bias of Language Models: A Mechanistic Approach	Jul 1, 2024	Mathobject-detection	CodeCode Available	1	5
Large Language Models Can Be Easily Distracted by Irrelevant Context	Jan 31, 2023	Arithmetic ReasoningLanguage Modeling	CodeCode Available	1	5
Large (Vision) Language Models are Unsupervised In-Context Learners	Apr 3, 2025	GSM8KIn-Context Learning	CodeCode Available	1	5
Discovering Mathematical Objects of Interest -- A Study of Mathematical Notations	Feb 7, 2020	Information RetrievalMath	CodeCode Available	1	5
Escape Sky-high Cost: Early-stopping Self-Consistency for Multi-step Reasoning	Jan 19, 2024	GSM8KMath	CodeCode Available	1	5
Efficient Reasoning for LLMs through Speculative Chain-of-Thought	Apr 27, 2025	GSM8KMath	CodeCode Available	1	5

Show:10 25 50

← PrevPage 16 of 64Next →

No leaderboard results yet.