
Math

Papers


Showing 711–720 of 1596 papers

Title	Date	Tasks	Status	Hype
Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts?	Mar 23, 2025	GSM8K, Math	Code Available	0
Long Is More Important Than Difficult for Training Reasoning Models	Mar 23, 2025	Math	Unverified	0
ChatBench: From Static Benchmarks to Human-AI Evaluation	Mar 22, 2025	Math, MMLU	Code Available	0
Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them	Mar 20, 2025	Math, Memorization	Unverified	0
BurTorch: Revisiting Training from First Principles by Coupling Autodiff, Math Optimization, and Systems	Mar 18, 2025	CPU, Math	Unverified	0
Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs	Mar 18, 2025	GSM8K, Math	Unverified	0
Pensez: Less Data, Better Reasoning -- Rethinking French LLM	Mar 17, 2025	Large Language Model, Math	Unverified	0
Improving Complex Reasoning with Dynamic Prompt Corruption: A soft prompt Optimization Approach	Mar 17, 2025	GSM8K, Math	Unverified	0
SPIN-Bench: How Well Do LLMs Plan Strategically and Reason Socially?	Mar 16, 2025	Board Games, Card Games	Unverified	0
The Impact of Item-Writing Flaws on Difficulty and Discrimination in Item Response Theory	Mar 13, 2025	Math, Multiple-choice	Unverified	0


No leaderboard results yet.