
Math

Papers


Showing 791–800 of 1596 papers

Title	Date	Tasks	Status	Hype
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving	Jun 18, 2024	Arithmetic Reasoning, Math	Code Available	2
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools	Jun 18, 2024	All, GSM8K	Code Available	14
Navigating the Labyrinth: Evaluating and Enhancing LLMs' Ability to Reason About Search Problems	Jun 18, 2024	In-Context Learning, Math	Unverified	0
Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles	Jun 18, 2024	Arithmetic Reasoning, Code Generation	Code Available	1
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts	Jun 17, 2024	Math	Unverified	0
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling	Jun 17, 2024	GSM8K, Math	Code Available	1
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence	Jun 17, 2024	16k, Language Modeling	Code Available	9
GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation	Jun 17, 2024	Image Generation, Math	Code Available	0
Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment	Jun 17, 2024	Logical Reasoning, Math	Unverified	0
Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning	Jun 16, 2024	Benchmarking, Math	Unverified	0

