SOTAVerified
Math
Papers
Showing 791–800 of 1596 papers
Title | Date | Tasks | Status | Hype
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving | Jun 18, 2024 | Arithmetic Reasoning, Math | Code Available | 2
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools | Jun 18, 2024 | All, GSM8K | Code Available | 14
Navigating the Labyrinth: Evaluating and Enhancing LLMs' Ability to Reason About Search Problems | Jun 18, 2024 | In-Context Learning, Math | Unverified | 0
Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles | Jun 18, 2024 | Arithmetic Reasoning, Code Generation | Code Available | 1
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts | Jun 17, 2024 | Math | Unverified | 0
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling | Jun 17, 2024 | GSM8K, Math | Code Available | 1
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence | Jun 17, 2024 | 16k, Language Modeling | Code Available | 9
GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation | Jun 17, 2024 | Image Generation, Math | Code Available | 0
Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment | Jun 17, 2024 | Logical Reasoning, Math | Unverified | 0
Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning | Jun 16, 2024 | Benchmarking, Math | Unverified | 0
No leaderboard results yet.