SOTAVerified

Math

Papers

Showing 776–800 of 1596 papers

| Title | Status | Hype |
| --- | --- | --- |
| Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning | | 0 |
| The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity | | 0 |
| Extracting the Unknown from Long Math Problems | | 0 |
| Fairness Hub Technical Briefs: AUC Gap | | 0 |
| Fairshare Data Pricing via Data Valuation for Large Language Models | | 0 |
| FANS -- Formal Answer Selection for Natural Language Math Reasoning Using Lean4 | | 0 |
| BurTorch: Revisiting Training from First Principles by Coupling Autodiff, Math Optimization, and Systems | | 0 |
| Fast Diffusion Inhibits Disease Outbreaks | | 0 |
| Faster and Better LLMs via Latency-Aware Test-Time Scaling | | 0 |
| Feature Selection Based on Confidence Machine | | 0 |
| The Impact of Item-Writing Flaws on Difficulty and Discrimination in Item Response Theory | | 0 |
| Few-Shot Recalibration of Language Models | | 0 |
| FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning | | 0 |
| FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models | | 0 |
| The Invalsi Benchmarks: measuring Linguistic and Mathematical understanding of Large Language Models in Italian | | 0 |
| Weakest Link in the Chain: Security Vulnerabilities in Advanced Reasoning Models | | 0 |
| First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning | | 0 |
| Fixation probabilities for the Moran process in evolutionary games with two strategies: graph shapes and large population asymptotics | | 0 |
| Fixation probabilities for the Moran process with three or more strategies: general and coupling results | | 0 |
| Building Math Agents with Multi-Turn Iterative Preference Learning | | 0 |
| Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration | | 0 |
| The Logic of Political Survival Revisited: Consequences of Elite Uncertainty Under Authoritarian Rule | | 0 |
| Formal Mathematical Reasoning: A New Frontier in AI | | 0 |
| The Long-Term Effects of Teachers' Gender Stereotypes | | 0 |
| fPLSA: Learning Semantic Structures in Document Collections Using Foundation Models | | 0 |

No leaderboard results yet.