Math

Papers

Showing 601–650 of 1596 papers

Title	Date	Tasks	Status	Hype
Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning	Oct 16, 2024	All, GSM8K	Code Available	0
LoRA Soups: Merging LoRAs for Practical Skill Composition Tasks	Oct 16, 2024	Math, parameter-efficient fine-tuning	Code Available	1
When Not to Answer: Evaluating Prompts on GPT Models for Effective Abstention in Unanswerable Math Word Problems	Oct 16, 2024	Hallucination, Math	Unverified	0
JudgeBench: A Benchmark for Evaluating LLM-based Judges	Oct 16, 2024	Math	Code Available	2
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs	Oct 15, 2024	GSM8K, Math	Unverified	0
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling	Oct 15, 2024	Instruction Following, Knowledge Distillation	Unverified	0
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks	Oct 14, 2024	Fairness, GSM8K	Code Available	0
Embedding Self-Correction as an Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning	Oct 14, 2024	Math, Mathematical Reasoning	Unverified	0
CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning	Oct 14, 2024	Math, Mathematical Reasoning	Code Available	1
Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps	Oct 14, 2024	Math	Unverified	0
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces	Oct 13, 2024	Computational Efficiency, Math	Unverified	0
HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics	Oct 13, 2024	Language Modeling, Language Modelling	Code Available	1
Expanding Search Space with Diverse Prompting Agents: An Efficient Sampling Approach for LLM Mathematical Reasoning	Oct 13, 2024	Math, Mathematical Reasoning	Unverified	0
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models	Oct 12, 2024	Math, reinforcement-learning	Code Available	5
Testing GPT-4-o1-preview on math and science problems: A follow-up study	Oct 11, 2024	Math, Spatial Reasoning	Unverified	0
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization	Oct 11, 2024	GSM8K, Language Modeling	Code Available	2
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights	Oct 11, 2024	GSM8K, Math	Code Available	4
The Geometry of Concepts: Sparse Autoencoder Feature Structure	Oct 10, 2024	Math	Code Available	1
VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models	Oct 10, 2024	Math	Code Available	2
Cognitive Noise and Altruistic Preferences	Oct 10, 2024	Math	Unverified	0
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models	Oct 10, 2024	GSM8K, Math	Code Available	2
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code	Oct 10, 2024	Math, Mathematical Reasoning	Code Available	2
Teaching-Inspired Integrated Prompting Framework: A Novel Approach for Enhancing Reasoning in Large Language Models	Oct 10, 2024	Arithmetic Reasoning, Math	Code Available	0
Herald: A Natural Language Annotated Lean 4 Dataset	Oct 9, 2024	Math, Mathematical Reasoning	Unverified	0
Hallucinating AI Hijacking Attack: Large Language Models and Malicious Code Recommenders	Oct 9, 2024	Math	Unverified	0
Subtle Errors Matter: Preference Learning via Error-injected Self-editing	Oct 9, 2024	GSM8K, Math	Unverified	0
O1 Replication Journey: A Strategic Progress Report -- Part 1	Oct 8, 2024	Math, scientific discovery	Code Available	7
Beyond Captioning: Task-Specific Prompting for Improved VLM Performance in Mathematical Reasoning	Oct 8, 2024	Image Retrieval, Math	Unverified	0
Solving Functional Optimization with Deep Networks and Variational Principles	Oct 8, 2024	Math	Unverified	0
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback	Oct 8, 2024	Math, Sequential Decision Making	Code Available	1
Give me a hint: Can LLMs take a hint to solve math problems?	Oct 8, 2024	Adversarial Robustness, Math	Code Available	0
FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning	Oct 8, 2024	GSM8K, Hallucination	Unverified	0
Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths	Oct 7, 2024	Attribute, GSM8K	Unverified	0
fPLSA: Learning Semantic Structures in Document Collections Using Foundation Models	Oct 7, 2024	Math	Unverified	0
Rule-based Data Selection for Large Language Models	Oct 7, 2024	Benchmarking, Math	Unverified	0
Intriguing Properties of Large Language and Vision Models	Oct 7, 2024	cross-modal alignment, Large Language Model	Unverified	0
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification	Oct 5, 2024	GSM8K, Math	Unverified	0
BloomWise: Enhancing Problem-Solving capabilities of Large Language Models using Bloom's-Taxonomy-Inspired Prompts	Oct 5, 2024	Math	Unverified	0
Steering Large Language Models between Code Execution and Textual Reasoning	Oct 4, 2024	Code Generation, Math	Code Available	2
Deliberate Reasoning for LLMs as Structure-aware Planning with Accurate World Model	Oct 4, 2024	Diversity, Logical Reasoning	Unverified	0
Towards the Pedagogical Steering of Large Language Models for Tutoring: A Case Study with Modeling Productive Failure	Oct 3, 2024	Math	Code Available	0
Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models	Oct 3, 2024	All, Language Modeling	Unverified	0
CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning	Oct 3, 2024	GSM8K, Language Modeling	Unverified	0
Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection	Oct 3, 2024	Math, parameter-efficient fine-tuning	Code Available	0
Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation	Oct 3, 2024	GSM8K, Math	Unverified	0
Deep Knowledge Tracing for Personalized Adaptive Learning at Historically Black Colleges and Universities	Oct 2, 2024	Knowledge Tracing, Math	Unverified	0
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo	Oct 2, 2024	Math	Unverified	0
PersonaMath: Enhancing Math Reasoning through Persona-Driven Data Augmentation	Oct 2, 2024	Data Augmentation, Diversity	Unverified	0
An Exploration of Self-Supervised Mutual Information Alignment for Multi-Task Settings	Oct 2, 2024	8k, Math	Code Available	0
Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks	Oct 2, 2024	Math, Navigate	Unverified	0

Page 13 of 32
