Math

Papers


Showing 551–600 of 1596 papers

Title	Date	Tasks	Status	Hype
Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus	Nov 19, 2024	Formal Logic, Logical Reasoning	Code Available	2
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues	Nov 19, 2024	Language Modeling, Language Modelling	Code Available	1
MM-Eval: A Hierarchical Benchmark for Modern Mongolian Evaluation in LLMs	Nov 14, 2024	General Knowledge, Math	Code Available	0
RESOLVE: Relational Reasoning with Symbolic and Object-Level Features Using Vector Symbolic Processing	Nov 13, 2024	Decoder, Math	Code Available	0
What Do Learning Dynamics Reveal About Generalization in LLM Reasoning?	Nov 12, 2024	GSM8K, Math	Code Available	1
Problem-Oriented Segmentation and Retrieval: Case Study on Tutoring Conversations	Nov 12, 2024	Math, Retrieval	Code Available	1
UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts	Nov 11, 2024	Code Generation, GSM8K	Code Available	1
OpenAI-o1 AB Testing: Does the o1 model really do good reasoning in math problem solving?	Nov 9, 2024	Logical Reasoning, Math	Unverified	0
VISTA: Visual Integrated System for Tailored Automation in Math Problem Generation Using LLM	Nov 8, 2024	Math	Unverified	0
Aioli: A Unified Optimization Framework for Language Model Data Mixing	Nov 8, 2024	Language Modeling, Language Modelling	Code Available	1
Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams	Nov 7, 2024	Math	Unverified	0
Meta-Reasoning Improves Tool Use in Large Language Models	Nov 7, 2024	Math	Code Available	0
Self-Consistency Preference Optimization	Nov 6, 2024	GSM8K, Math	Unverified	0
Automatic Generation of Question Hints for Mathematics Problems using Large Language Models in Educational Technology	Nov 5, 2024	Math, Misconceptions	Unverified	0
Leveraging Label Semantics and Meta-Label Refinement for Multi-Label Question Classification	Nov 4, 2024	Math, Reranking	Code Available	0
Regress, Don't Guess -- A Regression-like Loss on Number Tokens for Language Models	Nov 4, 2024	Inductive Bias, Language Modeling	Code Available	1
Dictionary Insertion Prompting for Multilingual Reasoning on Multilingual Large Language Models	Nov 2, 2024	GSM8K, Math	Unverified	0
STEM-POM: Evaluating Language Models Math-Symbol Reasoning in Document Parsing	Nov 1, 2024	2k, In-Context Learning	Unverified	0
DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models	Oct 29, 2024	Math, Mathematical Reasoning	Unverified	0
Improving Math Problem Solving in Large Language Models Through Categorization and Strategy Tailoring	Oct 29, 2024	Math	Unverified	0
Automated Feedback in Math Education: A Comparative Analysis of LLMs for Open-Ended Responses	Oct 29, 2024	Math, Zero-Shot Learning	Unverified	0
Autoformalize Mathematical Statements by Symbolic Equivalence and Semantic Consistency	Oct 28, 2024	Math	Code Available	1
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics	Oct 28, 2024	Arithmetic Reasoning, Math	Code Available	1
EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation	Oct 28, 2024	ARC, Math	Unverified	0
Flaming-hot Initiation with Regular Execution Sampling for Large Language Models	Oct 28, 2024	Diversity, Math	Code Available	2
Guiding Through Complexity: What Makes Good Supervision for Hard Reasoning Tasks?	Oct 27, 2024	Data Augmentation, Math	Code Available	0
Library Learning Doesn't: The Curious Case of the Single-Use "Library"	Oct 26, 2024	Math, Mathematical Reasoning	Code Available	0
Can Stories Help LLMs Reason? Curating Information Space Through Narrative	Oct 25, 2024	Math	Unverified	0
Mixture of Parrots: Experts improve memorization more than reasoning	Oct 24, 2024	Math, Memorization	Unverified	0
ReasonAgain: Using Extractable Symbolic Programs to Evaluate Mathematical Reasoning	Oct 24, 2024	GSM8K, Math	Unverified	0
Scaling up Masked Diffusion Models on Text	Oct 24, 2024	GSM8K, Language Modeling	Code Available	3
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch	Oct 24, 2024	Math, Mathematical Reasoning	Code Available	2
From Blind Solvers to Logical Thinkers: Benchmarking LLMs' Logical Integrity on Faulty Mathematical Problems	Oct 24, 2024	Benchmarking, Common Sense Reasoning	Unverified	0
MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning	Oct 23, 2024	Math, Mixture-of-Experts	Unverified	0
Optimizing Chain-of-Thought Reasoning: Tackling Arranging Bottleneck via Plan Augmentation	Oct 22, 2024	GSM8K, Math	Unverified	0
Non-myopic Generation of Language Models for Reasoning and Planning	Oct 22, 2024	Computational Efficiency, Language Modelling	Code Available	1
Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes	Oct 22, 2024	GSM8K, Language Modeling	Code Available	1
Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration	Oct 22, 2024	Math	Unverified	0
Polyak's Heavy Ball Method Achieves Accelerated Local Rate of Convergence under Polyak-Lojasiewicz Inequality	Oct 22, 2024	Math	Unverified	0
JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation	Oct 22, 2024	Math	Unverified	0
PromptHive: Bringing Subject Matter Experts Back to the Forefront with Collaborative Prompt Engineering for Educational Content Creation	Oct 21, 2024	Math, Prompt Engineering	Unverified	0
No more hard prompts: SoftSRV prompting for synthetic data generation	Oct 21, 2024	Language Modeling, Language Modelling	Unverified	0
InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems	Oct 21, 2024	Automated Theorem Proving, CPU	Code Available	4
Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From Cognitive Psychology	Oct 19, 2024	Logical Reasoning, Math	Unverified	0
On Designing Effective RL Reward at Training Time for LLM Reasoning	Oct 19, 2024	GSM8K, Math	Unverified	0
Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning	Oct 18, 2024	Math, Mathematical Reasoning	Unverified	0
Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens	Oct 18, 2024	Math, Question Answering	Unverified	0
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems	Oct 18, 2024	In-Context Learning, Math	Unverified	0
SBI-RAG: Enhancing Math Word Problem Solving for Students through Schema-Based Instruction and Retrieval-Augmented Generation	Oct 17, 2024	GSM8K, Language Modeling	Code Available	0
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model	Oct 17, 2024	Math	Code Available	2

