Math

Papers

Showing 1351–1400 of 1596 papers

Title	Date	Tasks	Status
The Hallucination Tax of Reinforcement Finetuning	May 20, 2025	Hallucination, Math	Unverified
Explaining Math Word Problem Solvers	Jul 24, 2023	Math	Unverified
Explain with Visual Keypoints Like a Real Mentor! A Benchmark for Multimodal Solution Explanation	Apr 4, 2025	Math, Mathematical Reasoning	Unverified
Explanation Generation for a Math Word Problem Solver	Oct 1, 2015	Explanation Generation, Math	Unverified
Explicit Knowledge Transfer for Weakly-Supervised Code Generation	Nov 30, 2022	Code Generation, Few-Shot Learning	Unverified
Exploring Educational Equity: A Machine Learning Approach to Unravel Achievement Disparities in Georgia	Jan 25, 2024	Math	Unverified
Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate	May 22, 2023	Benchmarking, Math	Unverified
Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them	Mar 20, 2025	Math, Memorization	Unverified
Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases	Mar 26, 2023	Math	Unverified
Calculus on MDPs: Potential Shaping as a Gradient	Aug 20, 2022	Math	Unverified
Exploring the Mystery of Influential Data for Mathematical Reasoning	Apr 1, 2024	Math, Mathematical Reasoning	Unverified
Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning	Jun 16, 2024	Benchmarking, Math	Unverified
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity	Jun 7, 2025	Math	Unverified
Extracting the Unknown from Long Math Problems	Mar 22, 2021	Math	Unverified
Fairness Hub Technical Briefs: AUC Gap	Sep 20, 2023	Fairness, Math	Unverified
Fairshare Data Pricing via Data Valuation for Large Language Models	Jan 31, 2025	Data Valuation, Math	Unverified
FANS -- Formal Answer Selection for Natural Language Math Reasoning Using Lean4	Mar 5, 2025	Answer Selection, Math	Unverified
BurTorch: Revisiting Training from First Principles by Coupling Autodiff, Math Optimization, and Systems	Mar 18, 2025	CPU, Math	Unverified
Fast Diffusion Inhibits Disease Outbreaks	Jul 29, 2019	Math	Unverified
Faster and Better LLMs via Latency-Aware Test-Time Scaling	May 26, 2025	Math	Unverified
Feature Selection Based on Confidence Machine	Oct 20, 2014	feature selection, Math	Unverified
The Impact of Item-Writing Flaws on Difficulty and Discrimination in Item Response Theory	Mar 13, 2025	Math, Multiple-choice	Unverified
Few-Shot Recalibration of Language Models	Mar 27, 2024	Math, MMLU	Unverified
FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning	Oct 8, 2024	GSM8K, Hallucination	Unverified
FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models	Mar 12, 2024	Math, Mathematical Reasoning	Unverified
The Invalsi Benchmarks: measuring Linguistic and Mathematical understanding of Large Language Models in Italian	Mar 27, 2024	Language Modelling, Math	Unverified
Weakest Link in the Chain: Security Vulnerabilities in Advanced Reasoning Models	Jun 16, 2025	Math	Unverified
First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning	Nov 14, 2023	GSM8K, Math	Unverified
Fixation probabilities for the Moran process in evolutionary games with two strategies: graph shapes and large population asymptotics	Apr 30, 2018	Math	Unverified
Fixation probabilities for the Moran process with three or more strategies: general and coupling results	Nov 23, 2018	Math	Unverified
Building Math Agents with Multi-Turn Iterative Preference Learning	Sep 4, 2024	GSM8K, Math	Unverified
Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration	Oct 22, 2024	Math	Unverified
The Logic of Political Survival Revisited: Consequences of Elite Uncertainty Under Authoritarian Rule	Aug 4, 2024	Math	Unverified
Formal Mathematical Reasoning: A New Frontier in AI	Dec 20, 2024	Automated Theorem Proving, Math	Unverified
The Long-Term Effects of Teachers' Gender Stereotypes	Dec 16, 2022	Math	Unverified
fPLSA: Learning Semantic Structures in Document Collections Using Foundation Models	Oct 7, 2024	Math	Unverified
FRACTAL: Fine-Grained Scoring from Aggregate Text Labels	Apr 7, 2024	Math, Multiple Instance Learning	Unverified
BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning	Jan 31, 2025	Language Modeling, Language Modelling	Unverified
From Blind Solvers to Logical Thinkers: Benchmarking LLMs' Logical Integrity on Faulty Mathematical Problems	Oct 24, 2024	Benchmarking, Common Sense Reasoning	Unverified
From fixation probabilities to d-player games: an inverse problem in evolutionary dynamics	Nov 20, 2018	Math, Unity	Unverified
The Mathematics of Market Timing	Dec 13, 2017	Math	Unverified
From Good to Great: Improving Math Reasoning with Tool-Augmented Interleaf Prompting	Dec 18, 2023	Diversity, GSM8K	Unverified
From Large to Tiny: Distilling and Refining Mathematical Expertise for Math Word Problems with Weakly Supervision	Mar 21, 2024	Math	Unverified
From Textbooks to Knowledge: A Case Study in Harvesting Axiomatic Knowledge from Textbooks to Solve Geometry Problems	Sep 1, 2017	Math, Question Answering	Unverified
From Text to Visuals: Using LLMs to Generate Math Diagrams with Vector Graphics	Mar 10, 2025	Math, Question Answering	Unverified
Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens	Oct 18, 2024	Math, Question Answering	Unverified
Bridging Offline and Online Reinforcement Learning for LLMs	Jun 26, 2025	Instruction Following, Math	Unverified
Breaking Ties: Regression Discontinuity Design Meets Market Design	Dec 31, 2020	Math, regression	Unverified
Gamifying Math Education using Object Detection	Apr 13, 2023	Math, Object	Unverified
GAPS: Geometry-Aware Problem Solver	Jan 29, 2024	Geometry Problem Solving, Math	Unverified