SOTAVerified

Multi-task Language Understanding

The test covers 57 tasks, including elementary mathematics, US history, computer science, and law, among others. Source: https://arxiv.org/pdf/2009.03300.pdf

Papers

Showing 51-57 of 57 papers

Title | Status | Hype
The Falcon Series of Open Language Models | | 0
Claude 3.5 Sonnet Model Card Addendum | | 0
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM | | 0
Transcending Scaling Laws with 0.1% Extra Compute | | 0
Model Card and Evaluations for Claude Models | | 0
Orca 2: Teaching Small Language Models How to Reason | | 0
PaLM 2 Technical Report | | 0

No leaderboard results yet.