Math

Papers

Showing 576–600 of 1596 papers

Title	Date	Tasks	Status	Hype
Guiding Through Complexity: What Makes Good Supervision for Hard Reasoning Tasks?	Oct 27, 2024	Data Augmentation, Math	Code Available	0
Library Learning Doesn't: The Curious Case of the Single-Use "Library"	Oct 26, 2024	Math, Mathematical Reasoning	Code Available	0
Can Stories Help LLMs Reason? Curating Information Space Through Narrative	Oct 25, 2024	Math	Unverified	0
Mixture of Parrots: Experts improve memorization more than reasoning	Oct 24, 2024	Math, Memorization	Unverified	0
ReasonAgain: Using Extractable Symbolic Programs to Evaluate Mathematical Reasoning	Oct 24, 2024	GSM8K, Math	Unverified	0
Scaling up Masked Diffusion Models on Text	Oct 24, 2024	GSM8K, Language Modeling	Code Available	3
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch	Oct 24, 2024	Math, Mathematical Reasoning	Code Available	2
From Blind Solvers to Logical Thinkers: Benchmarking LLMs' Logical Integrity on Faulty Mathematical Problems	Oct 24, 2024	Benchmarking, Common Sense Reasoning	Unverified	0
MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning	Oct 23, 2024	Math, Mixture-of-Experts	Unverified	0
Optimizing Chain-of-Thought Reasoning: Tackling Arranging Bottleneck via Plan Augmentation	Oct 22, 2024	GSM8K, Math	Unverified	0
Non-myopic Generation of Language Models for Reasoning and Planning	Oct 22, 2024	Computational Efficiency, Language Modelling	Code Available	1
Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes	Oct 22, 2024	GSM8K, Language Modeling	Code Available	1
Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration	Oct 22, 2024	Math	Unverified	0
Polyak's Heavy Ball Method Achieves Accelerated Local Rate of Convergence under Polyak-Lojasiewicz Inequality	Oct 22, 2024	Math	Unverified	0
JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation	Oct 22, 2024	Math	Unverified	0
PromptHive: Bringing Subject Matter Experts Back to the Forefront with Collaborative Prompt Engineering for Educational Content Creation	Oct 21, 2024	Math, Prompt Engineering	Unverified	0
No more hard prompts: SoftSRV prompting for synthetic data generation	Oct 21, 2024	Language Modeling, Language Modelling	Unverified	0
InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems	Oct 21, 2024	Automated Theorem Proving, CPU	Code Available	4
Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From Cognitive Psychology	Oct 19, 2024	Logical Reasoning, Math	Unverified	0
On Designing Effective RL Reward at Training Time for LLM Reasoning	Oct 19, 2024	GSM8K, Math	Unverified	0
Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning	Oct 18, 2024	Math, Mathematical Reasoning	Unverified	0
Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens	Oct 18, 2024	Math, Question Answering	Unverified	0
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems	Oct 18, 2024	In-Context Learning, Math	Unverified	0
SBI-RAG: Enhancing Math Word Problem Solving for Students through Schema-Based Instruction and Retrieval-Augmented Generation	Oct 17, 2024	GSM8K, Language Modeling	Code Available	0
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model	Oct 17, 2024	Math	Code Available	2

