Math

Papers

Showing 281–290 of 1596 papers

Title	Date	Tasks	Status	Hype
Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression	Apr 10, 2025	Math, MMLU	Code Available	1
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models	Apr 8, 2025	Math, Multimodal Reasoning	Code Available	1
Large (Vision) Language Models are Unsupervised In-Context Learners	Apr 3, 2025	GSM8K, In-Context Learning	Code Available	1
BlenderGym: Benchmarking Foundational Model Systems for Graphics Editing	Apr 2, 2025	3D Reconstruction, Benchmarking	Code Available	1
Entropy-Based Adaptive Weighting for Self-Training	Mar 31, 2025	GSM8K, Math	Code Available	1
QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks?	Mar 28, 2025	Logical Reasoning, Math	Code Available	1
ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models	Mar 27, 2025	Math	Code Available	1
LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation	Mar 25, 2025	Code Completion, Language Modeling	Code Available	1
EXAONE Deep: Reasoning Enhanced Language Models	Mar 16, 2025	Math	Code Available	1
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search	Mar 13, 2025	Image Retrieval, Math	Code Available	1

No leaderboard results yet.