
Math

Papers


Showing 771–780 of 1596 papers

Title	Date	Tasks	Status	Hype
Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning	Jun 30, 2024	GSM8K, Math	Code Available	1
Advancing Process Verification for Large Language Models via Tree-Based Preference Learning	Jun 29, 2024	Binary Classification, GSM8K	Unverified	0
CMMaTH: A Chinese Multi-modal Math Skill Evaluation Benchmark for Foundation Models	Jun 28, 2024	Diversity, Math	Unverified	0
ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting	Jun 28, 2024	Bilevel Optimization, Instruction Following	Unverified	0
DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions	Jun 27, 2024	Distractor Generation, Math	Code Available	0
LiveBench: A Challenging, Contamination-Limited LLM Benchmark	Jun 27, 2024	Articles, Instruction Following	Code Available	5
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs	Jun 26, 2024	Arithmetic Reasoning, GSM8K	Code Available	3
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data	Jun 26, 2024	Benchmarking, Math	Code Available	2
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models	Jun 25, 2024	Diversity, Math	Code Available	2
Task Oriented In-Domain Data Augmentation	Jun 24, 2024	Data Augmentation, Math	Unverified	0


No leaderboard results yet.