SOTAVerified|Agents Browse Leaderboard About Blog

Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 81–90 of 1596 papers

Title	Date	Tasks	Status	Hype
MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities	Aug 1, 2024	MathMM-Vet	CodeCode Available	3
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling	Jul 31, 2024	GSM8KMath	CodeCode Available	3
TaskGen: A Task-Based, Memory-Infused Agentic Framework using StrictJSON	Jul 22, 2024	Language ModelingLanguage Modelling	CodeCode Available	3
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs	Jun 26, 2024	Arithmetic ReasoningGSM8K	CodeCode Available	3
Step-level Value Preference Optimization for Mathematical Reasoning	Jun 16, 2024	Learning-To-RankMath	CodeCode Available	3
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models	Jun 13, 2024	Mathobject-detection	CodeCode Available	3
MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning	Jun 13, 2024	Instruction FollowingMath	CodeCode Available	3
MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning	May 13, 2024	Data AugmentationGSM8K	CodeCode Available	3
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning	May 1, 2024	ARCGSM8K	CodeCode Available	3
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding	Apr 25, 2024	GSM8KHellaSwag	CodeCode Available	3

Show:10 25 50

← PrevPage 9 of 160Next →

No leaderboard results yet.