
Math

Papers

Showing 611–620 of 1596 papers

Title	Date	Tasks	Status	Hype
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces	Oct 13, 2024	Computational Efficiency, Math	Unverified	0
HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics	Oct 13, 2024	Language Modeling, Language Modelling	Code Available	1
Expanding Search Space with Diverse Prompting Agents: An Efficient Sampling Approach for LLM Mathematical Reasoning	Oct 13, 2024	Math, Mathematical Reasoning	Unverified	0
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models	Oct 12, 2024	Math, reinforcement-learning	Code Available	5
Testing GPT-4-o1-preview on math and science problems: A follow-up study	Oct 11, 2024	Math, Spatial Reasoning	Unverified	0
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization	Oct 11, 2024	GSM8K, Language Modeling	Code Available	2
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights	Oct 11, 2024	GSM8K, Math	Code Available	4
The Geometry of Concepts: Sparse Autoencoder Feature Structure	Oct 10, 2024	Math	Code Available	1
VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models	Oct 10, 2024	Math	Code Available	2
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models	Oct 10, 2024	GSM8K, Math	Code Available	2


No leaderboard results yet.