SOTAVerified|Agents Browse Leaderboard About

Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1526–1550 of 1596 papers

Title	Date	Tasks	Status	Hype	Score
Tighter 'uniform bounds for Black-Scholes implied volatility' and the applications to root-finding	Feb 17, 2023	Math	—Unverified	0	0
Language Models with Conformal Factuality Guarantees	Feb 15, 2024	Conformal PredictionLanguage Modeling	—Unverified	0	0
TinyGSM: achieving >80% on GSM8k with small language models	Dec 14, 2023	Arithmetic ReasoningGSM8K	—Unverified	0	0
YODA: Teacher-Student Progressive Learning for Language Models	Jan 28, 2024	GSM8KMath	—Unverified	0	0
Large Language Models Are Struggle to Cope with Unreasonability in Math Problems	Mar 28, 2024	Math	—Unverified	0	0
Large Language Models as Analogical Reasoners	Oct 3, 2023	Code GenerationGSM8K	—Unverified	0	0
1bit-Merging: Dynamic Quantized Merging for Large Language Models	Feb 15, 2025	Code GenerationMath	—Unverified	0	0
Large Language Models Can Self-Correct with Key Condition Verification	May 23, 2024	Arithmetic ReasoningMath	—Unverified	0	0
Large Language Models for Mathematical Reasoning: Progresses and Challenges	Jan 31, 2024	DiversityMath	—Unverified	0	0
Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions	Aug 16, 2024	DescriptiveHallucination	—Unverified	0	0
Large Language Models' Understanding of Math: Source Criticism and Extrapolation	Nov 12, 2023	Automated Theorem ProvingMath	—Unverified	0	0
Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving	Jan 28, 2025	MathMathematical Problem-Solving	—Unverified	0	0
Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH	Jan 30, 2025	Language ModelingLanguage Modelling	—Unverified	0	0
LaRS: Latent Reasoning Skills for Chain-of-Thought Reasoning	Dec 7, 2023	In-Context LearningMath	—Unverified	0	0
Benchmarking and Improving Generator-Validator Consistency of Language Models	Oct 3, 2023	BenchmarkingInstruction Following	—Unverified	0	0
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models	Oct 2, 2024	Cross-Lingual TransferMath	—Unverified	0	0
Laying the Foundation First? Investigating the Generalization from Atomic Skills to Complex Reasoning Tasks	Mar 14, 2024	MathSkill Generalization	—Unverified	0	0
BeamLoRA: Beam-Constraint Low-Rank Adaptation	Feb 19, 2025	Code GenerationMath	—Unverified	0	0
Basic concepts, definitions, and methods in D number theory	Mar 21, 2020	Math	—Unverified	0	0
Lean-ing on Quality: How High-Quality Data Beats Diverse Multilingual Data in AutoFormalization	Feb 18, 2025	Math	—Unverified	0	0
Backward bifurcation and saddle-node bifurcation in virus-immune dynamics	Dec 1, 2021	Math	—Unverified	0	0
Learning Autonomous Code Integration for Math Language Models	Feb 2, 2025	Math	—Unverified	0	0
Learning Beyond Pattern Matching? Assaying Mathematical Understanding in LLMs	May 24, 2024	In-Context LearningLanguage Modeling	—Unverified	0	0
Learning by Analogy: Enhancing Few-Shot Prompting for Math Word Problem Solving with Computational Graph-Based Retrieval	Nov 25, 2024	MathMath Word Problem Solving	—Unverified	0	0
Token-Supervised Value Models for Enhancing Mathematical Reasoning Capabilities of Large Language Models	Jul 12, 2024	GSM8KMath	—Unverified	0	0

Show:10 25 50

← PrevPage 62 of 64Next →

No leaderboard results yet.