Math

Papers

Showing 226–250 of 1596 papers

Title	Date	Tasks	Status	Hype	Score
Flaming-hot Initiation with Regular Execution Sampling for Large Language Models	Oct 28, 2024	Diversity, Math	Code Available	2	5
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning	Oct 5, 2023	Arithmetic Reasoning, GSM8K	Code Available	2	5
Archon: An Architecture Search Framework for Inference-Time Techniques	Sep 23, 2024	Hyperparameter Optimization, Instruction Following	Code Available	2	5
AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions	Jun 10, 2025	Math	Code Available	2	5
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO	May 28, 2025	Math, Reinforcement Learning (RL)	Code Available	2	5
MathPile: A Billion-Token-Scale Pretraining Corpus for Math	Dec 28, 2023	Language Identification, Math	Code Available	2	5
Memorizing Transformers	Mar 16, 2022	Language Modeling, Language Modelling	Code Available	2	5
On the Emergence of Thinking in LLMs I: Searching for the Right Intuition	Feb 10, 2025	Math	Code Available	2	5
SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models	Jan 15, 2024	Math, Mathematical Reasoning	Code Available	2	5
Expression Syntax Information Bottleneck for Math Word Problems	Oct 24, 2023	Math	Code Available	1	5
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models	Apr 14, 2025	Mamba, Math	Code Available	1	5
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs	Jun 24, 2024	Instruction Following, Math	Code Available	1	5
A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods	Feb 3, 2025	Math, Mathematical Reasoning	Code Available	1	5
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?	Feb 26, 2025	Math	Code Available	1	5
Explaining Datasets in Words: Statistical Models with Natural Language Parameters	Sep 13, 2024	Clustering, Language Modeling	Code Available	1	5
Can an AI Win Ghana's National Science and Maths Quiz? An AI Grand Challenge for Education	Jan 30, 2023	Math, Position	Code Available	1	5
A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning	Jul 11, 2025	Math, Mathematical Reasoning	Code Available	1	5
LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation	Mar 25, 2025	Code Completion, Language Modeling	Code Available	1	5
Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective	Jun 22, 2025	In-Context Learning, Large Language Model	Code Available	1	5
LLMThinkBench: Towards Basic Math Reasoning and Overthinking in Large Language Models	Jul 5, 2025	Benchmarking, GPU	Code Available	1	5
EXAONE Deep: Reasoning Enhanced Language Models	Mar 16, 2025	Math	Code Available	1	5
LoRA Soups: Merging LoRAs for Practical Skill Composition Tasks	Oct 16, 2024	Math, parameter-efficient fine-tuning	Code Available	1	5
MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models	Feb 2, 2024	Language Modelling, Large Language Model	Code Available	1	5
Building Dataset for Grounding of Formulae — Annotating Coreference Relations Among Math Identifiers	Jun 1, 2022	Math	Code Available	1	5
Broken Neural Scaling Laws	Oct 26, 2022	Adversarial Robustness, Continual Learning	Code Available	1	5
