SOTAVerified|Agents Browse Leaderboard About

Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 151–160 of 1596 papers

Title	Date	Tasks	Status	Hype
UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models	Feb 1, 2025	Math	CodeCode Available	2
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate	Jan 29, 2025	Instruction FollowingMath	CodeCode Available	2
Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling	Jan 20, 2025	Imitation LearningLanguage Modeling	CodeCode Available	2
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics	Jan 8, 2025	MathMathematical Reasoning	CodeCode Available	2
Offline Reinforcement Learning for LLM Multi-Step Reasoning	Dec 20, 2024	GSM8KMath	CodeCode Available	2
ProcessBench: Identifying Process Errors in Mathematical Reasoning	Dec 9, 2024	GSM8KMath	CodeCode Available	2
Preference Optimization for Reasoning with Pseudo Feedback	Nov 25, 2024	GSM8KMath	CodeCode Available	2
LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training	Nov 24, 2024	MathMixture-of-Experts	CodeCode Available	2
Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus	Nov 19, 2024	Formal LogicLogical Reasoning	CodeCode Available	2
Flaming-hot Initiation with Regular Execution Sampling for Large Language Models	Oct 28, 2024	DiversityMath	CodeCode Available	2

Show:10 25 50

← PrevPage 16 of 160Next →

No leaderboard results yet.