
Math

Papers

Showing 381–390 of 1596 papers

Title	Date	Tasks	Status	Hype
Implicit Chain of Thought Reasoning via Knowledge Distillation	Nov 2, 2023	Knowledge Distillation, Math	Code Available	1
Design of Chain-of-Thought in Math Problem Solving	Sep 20, 2023	Diversity, GSM8K	Code Available	1
How well do Large Language Models perform in Arithmetic tasks?	Mar 16, 2023	Math	Code Available	1
Improving the Validity of Automatically Generated Feedback via Reinforcement Learning	Mar 2, 2024	Math, Misconceptions	Code Available	1
How to Get Your LLM to Generate Challenging Problems for Evaluation	Feb 20, 2025	Code Completion, Math	Code Available	1
DICE: Detecting In-distribution Contamination in LLM's Fine-tuning Phase for Math Reasoning	Jun 6, 2024	Math	Code Available	1
Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities	Feb 17, 2025	Code Generation, HumanEval	Code Available	1
HARP: A challenging human-annotated math reasoning benchmark	Dec 11, 2024	Math	Code Available	1
HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics	Oct 13, 2024	Language Modeling, Language Modelling	Code Available	1
DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Documents	Nov 16, 2023	Math	Code Available	1
