Math

Papers

Showing 626–650 of 1596 papers

Title	Date	Tasks	Status	Hype	Score
Mind Scramble: Unveiling Large Language Model Psychology Via Typoglycemia	Oct 2, 2024	Language Modeling, Language Modelling	Code Available	0	5
MIRB: Mathematical Information Retrieval Benchmark	May 21, 2025	Automated Theorem Proving, Information Retrieval	Code Available	0	5
Meta-Reasoning Improves Tool Use in Large Language Models	Nov 7, 2024	Math	Code Available	0	5
How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study	May 21, 2025	Math	Code Available	0	5
How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark	May 24, 2025	Math	Code Available	0	5
metboost: Exploratory regression analysis with hierarchically clustered data	Feb 13, 2017	Math, Missing Values	Code Available	0	5
How Do Humans Write Code? Large Models Do It the Same Way Too	Feb 24, 2024	Code Generation, Math	Code Available	0	5
ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models	May 22, 2025	Large Language Model, Math	Code Available	0	5
Misplaced Trust: Measuring the Interference of Machine Learning in Human Decision-Making	May 22, 2020	BIG-bench Machine Learning, Decision Making	Code Available	0	5
mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language Models	Jun 4, 2024	Math	Code Available	0	5
MAWPS: A Math Word Problem Repository	Jun 1, 2016	Math, Math Word Problem Solving	Code Available	0	5
Heteroclinic cycling and extinction in May-Leonard models with demographic stochasticity	Nov 10, 2021	Math, Unity	Code Available	0	5
ComSearch: Equation Searching with Combinatorial Strategy for Solving Math Word Problems with Weak Supervision	Oct 13, 2022	Math	Code Available	0	5
Math Word Problem Solving by Generating Linguistic Variants of Problem Statements	Jun 24, 2023	Decoder, Ingenuity	Code Available	0	5
Algebra Error Classification with Large Language Models	May 8, 2023	Classification, Math	Code Available	0	5
Helpful assistant or fruitful facilitator? Investigating how personas affect language model behavior	Jul 2, 2024	Language Modeling, Language Modelling	Code Available	0	5
ASyMOB: Algebraic Symbolic Mathematical Operations Benchmark	May 28, 2025	Math	Code Available	0	5
MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning	Feb 27, 2024	8k, Language Modeling	Code Available	0	5
Computationally Identifying Funneling and Focusing Questions in Classroom Discourse	Jul 8, 2022	Math	Code Available	0	5
MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark	Aug 14, 2024	Math, Mathematical Reasoning	Code Available	0	5
Hard Negative Contrastive Learning for Fine-Grained Geometric Understanding in Large Multimodal Models	May 26, 2025	Contrastive Learning, Math	Code Available	0	5
Compositional Processing Emerges in Neural Networks Solving Math Problems	May 19, 2021	Math, Mathematical Reasoning	Code Available	0	5
MathScale: Scaling Instruction Tuning for Mathematical Reasoning	Mar 5, 2024	GSM8K, Math	Code Available	0	5
HARDMath2: A Benchmark for Applied Mathematics Built by Students as Part of a Graduate Class	May 17, 2025	Math, Mathematical Problem-Solving	Code Available	0	5
Complex Mathematical Symbol Definition Structures: A Dataset and Model for Coordination Resolution in Definition Extraction	May 24, 2023	Definition Extraction, Math	Code Available	0	5
