Math

Papers

Showing 626–650 of 1596 papers

Title	Date	Tasks	Status	Hype
Subtle Errors Matter: Preference Learning via Error-injected Self-editing	Oct 9, 2024	GSM8K, Math	Unverified	0
O1 Replication Journey: A Strategic Progress Report -- Part 1	Oct 8, 2024	Math, scientific discovery	Code Available	7
Beyond Captioning: Task-Specific Prompting for Improved VLM Performance in Mathematical Reasoning	Oct 8, 2024	Image Retrieval, Math	Unverified	0
Solving Functional Optimization with Deep Networks and Variational Principles	Oct 8, 2024	Math	Unverified	0
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback	Oct 8, 2024	Math, Sequential Decision Making	Code Available	1
Give me a hint: Can LLMs take a hint to solve math problems?	Oct 8, 2024	Adversarial Robustness, Math	Code Available	0
FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning	Oct 8, 2024	GSM8K, Hallucination	Unverified	0
Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths	Oct 7, 2024	Attribute, GSM8K	Unverified	0
fPLSA: Learning Semantic Structures in Document Collections Using Foundation Models	Oct 7, 2024	Math	Unverified	0
Rule-based Data Selection for Large Language Models	Oct 7, 2024	Benchmarking, Math	Unverified	0
Intriguing Properties of Large Language and Vision Models	Oct 7, 2024	cross-modal alignment, Large Language Model	Unverified	0
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification	Oct 5, 2024	GSM8K, Math	Unverified	0
BloomWise: Enhancing Problem-Solving capabilities of Large Language Models using Bloom's-Taxonomy-Inspired Prompts	Oct 5, 2024	Math	Unverified	0
Steering Large Language Models between Code Execution and Textual Reasoning	Oct 4, 2024	Code Generation, Math	Code Available	2
Deliberate Reasoning for LLMs as Structure-aware Planning with Accurate World Model	Oct 4, 2024	Diversity, Logical Reasoning	Unverified	0
Towards the Pedagogical Steering of Large Language Models for Tutoring: A Case Study with Modeling Productive Failure	Oct 3, 2024	Math	Code Available	0
Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models	Oct 3, 2024	All, Language Modeling	Unverified	0
CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning	Oct 3, 2024	GSM8K, Language Modeling	Unverified	0
Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection	Oct 3, 2024	Math, parameter-efficient fine-tuning	Code Available	0
Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation	Oct 3, 2024	GSM8K, Math	Unverified	0
Deep Knowledge Tracing for Personalized Adaptive Learning at Historically Black Colleges and Universities	Oct 2, 2024	Knowledge Tracing, Math	Unverified	0
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo	Oct 2, 2024	Math	Unverified	0
PersonaMath: Enhancing Math Reasoning through Persona-Driven Data Augmentation	Oct 2, 2024	Data Augmentation, Diversity	Unverified	0
An Exploration of Self-Supervised Mutual Information Alignment for Multi-Task Settings	Oct 2, 2024	8k, Math	Code Available	0
Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks	Oct 2, 2024	Math, Navigate	Unverified	0

