SOTAVerified

Math

Papers

Showing 1471–1480 of 1596 papers

Title	Date	Tasks	Status	Hype
How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark	May 24, 2025	Math	Code Available	0
How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study	May 21, 2025	Math	Code Available	0
World Models for Math Story Problems	Jun 7, 2023	Math	Code Available	0
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks	Oct 14, 2024	Fairness, GSM8K	Code Available	0
ChatBench: From Static Benchmarks to Human-AI Evaluation	Mar 22, 2025	Math, MMLU	Code Available	0
Augmented Math: Authoring AR-Based Explorable Explanations by Augmenting Static Math Textbooks	Jul 30, 2023	Math, Optical Character Recognition	Code Available	0
When an LLM is apprehensive about its answers -- and when its uncertainty is justified	Mar 3, 2025	Math, MMLU	Code Available	0
Can Large Language Models Replicate ITS Feedback on Open-Ended Math Questions?	May 10, 2024	Math, text similarity	Code Available	0
Skellam Mixture Mechanism: a Novel Approach to Federated Learning with Differential Privacy	Dec 8, 2022	Federated Learning, Math	Code Available	0
Classifying Math KCs via Task-Adaptive Pre-Trained BERT	May 24, 2021	Math, Prediction	Code Available	0

Page 148 of 160

No leaderboard results yet.