SOTAVerified

Multi-task Language Understanding

The test covers 57 tasks, including elementary mathematics, US history, computer science, and law. Source: https://arxiv.org/pdf/2009.03300.pdf

Papers

Showing 1 of 1 papers

Title | Status | Hype
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | Code | 15

No leaderboard results yet.