Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1501–1550 of 1596 papers

Title	Date	Tasks	Status
Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination	Jun 10, 2023	MathMathematical Reasoning	—Unverified
Investigating the Efficacy of Large Language Models in Reflective Assessment Methods through Chain of Thoughts Prompting	Sep 30, 2023	Math	—Unverified
Thinking Outside the (Gray) Box: A Context-Based Score for Assessing Value and Originality in Neural Text Generation	Feb 18, 2025	DiversityMath	—Unverified
IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations	Apr 1, 2024	BenchmarkingMath	—Unverified
Solving Functional Optimization with Deep Networks and Variational Principles	Oct 8, 2024	Math	—Unverified
Is your LLM trapped in a Mental Set? Investigative study on how mental sets affect the reasoning capabilities of LLMs	Jan 21, 2025	GSM8KIn-Context Learning	—Unverified
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist	Jul 11, 2024	GSM8KMath	—Unverified
Iterative Reasoning Preference Optimization	Apr 30, 2024	ARCGSM8K	—Unverified
Yi-Lightning Technical Report	Dec 2, 2024	ChatbotLarge Language Model	—Unverified
Adaptive Guidance Accelerates Reinforcement Learning of Reasoning Models	Jun 16, 2025	Mathreinforcement-learning	—Unverified
JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation	Oct 22, 2024	Math	—Unverified
Beyond Captioning: Task-Specific Prompting for Improved VLM Performance in Mathematical Reasoning	Oct 8, 2024	Image RetrievalMath	—Unverified
Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking	Mar 25, 2025	MathReinforcement Learning (RL)	—Unverified
Kappa Learning: A New Method for Measuring Similarity Between Educational Items Using Performance Data	Dec 20, 2018	ClusteringMath	—Unverified
Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning	Mar 4, 2024	GSM8KMath	—Unverified
Thought-Augmented Policy Optimization: Bridging External Guidance and Internal Capabilities	May 21, 2025	MathReinforcement Learning (RL)	—Unverified
Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains	Jun 2, 2025	MathReinforcement Learning (RL)	—Unverified
Knowledge Tagging System on Math Questions via LLMs with Flexible Demonstration Retriever	Jun 19, 2024	MathSemantic Similarity	—Unverified
Knowledge Tagging with Large Language Model based Multi-Agent System	Sep 12, 2024	Language ModelingLanguage Modelling	—Unverified
Kokoyi: Executable LaTeX for End-to-end Deep Learning	Sep 29, 2021	Deep LearningMath	—Unverified
L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models	Sep 29, 2023	Code GenerationMath	—Unverified
Better Process Supervision with Bi-directional Rewarding Signals	Mar 6, 2025	Language ModelingLanguage Modelling	—Unverified
Adapting the LodView RDF Browser for Navigation over the Multilingual Linguistic Linked Open Data Cloud	Aug 28, 2022	Math	—Unverified
Benchmarking Reasoning Robustness in Large Language Models	Mar 6, 2025	BenchmarkingMath	—Unverified
THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models	Apr 17, 2025	BenchmarkingMath	—Unverified
Tighter 'uniform bounds for Black-Scholes implied volatility' and the applications to root-finding	Feb 17, 2023	Math	—Unverified
Language Models with Conformal Factuality Guarantees	Feb 15, 2024	Conformal PredictionLanguage Modeling	—Unverified
TinyGSM: achieving >80% on GSM8k with small language models	Dec 14, 2023	Arithmetic ReasoningGSM8K	—Unverified
YODA: Teacher-Student Progressive Learning for Language Models	Jan 28, 2024	GSM8KMath	—Unverified
Large Language Models Are Struggle to Cope with Unreasonability in Math Problems	Mar 28, 2024	Math	—Unverified
Large Language Models as Analogical Reasoners	Oct 3, 2023	Code GenerationGSM8K	—Unverified
1bit-Merging: Dynamic Quantized Merging for Large Language Models	Feb 15, 2025	Code GenerationMath	—Unverified
Large Language Models Can Self-Correct with Key Condition Verification	May 23, 2024	Arithmetic ReasoningMath	—Unverified
Large Language Models for Mathematical Reasoning: Progresses and Challenges	Jan 31, 2024	DiversityMath	—Unverified
Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions	Aug 16, 2024	DescriptiveHallucination	—Unverified
Large Language Models' Understanding of Math: Source Criticism and Extrapolation	Nov 12, 2023	Automated Theorem ProvingMath	—Unverified
Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving	Jan 28, 2025	MathMathematical Problem-Solving	—Unverified
Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH	Jan 30, 2025	Language ModelingLanguage Modelling	—Unverified
LaRS: Latent Reasoning Skills for Chain-of-Thought Reasoning	Dec 7, 2023	In-Context LearningMath	—Unverified
Benchmarking and Improving Generator-Validator Consistency of Language Models	Oct 3, 2023	BenchmarkingInstruction Following	—Unverified
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models	Oct 2, 2024	Cross-Lingual TransferMath	—Unverified
Laying the Foundation First? Investigating the Generalization from Atomic Skills to Complex Reasoning Tasks	Mar 14, 2024	MathSkill Generalization	—Unverified
BeamLoRA: Beam-Constraint Low-Rank Adaptation	Feb 19, 2025	Code GenerationMath	—Unverified
Basic concepts, definitions, and methods in D number theory	Mar 21, 2020	Math	—Unverified
Lean-ing on Quality: How High-Quality Data Beats Diverse Multilingual Data in AutoFormalization	Feb 18, 2025	Math	—Unverified
Backward bifurcation and saddle-node bifurcation in virus-immune dynamics	Dec 1, 2021	Math	—Unverified
Learning Autonomous Code Integration for Math Language Models	Feb 2, 2025	Math	—Unverified
Learning Beyond Pattern Matching? Assaying Mathematical Understanding in LLMs	May 24, 2024	In-Context LearningLanguage Modeling	—Unverified
Learning by Analogy: Enhancing Few-Shot Prompting for Math Word Problem Solving with Computational Graph-Based Retrieval	Nov 25, 2024	MathMath Word Problem Solving	—Unverified
Token-Supervised Value Models for Enhancing Mathematical Reasoning Capabilities of Large Language Models	Jul 12, 2024	GSM8KMath	—Unverified

Show:10 25 50

← PrevPage 31 of 32Next →

No leaderboard results yet.