SOTAVerified

Math

Papers

Showing 751-760 of 1596 papers

Title | Status | Hype
TurkishMMLU: Measuring Massive Multitask Language Understanding in Turkish | Code | 1
A LLM Benchmark based on the Minecraft Builder Dialog Agent Task | | 0
Reasoning with Large Language Models, a Survey | | 0
CCoE: A Compact LLM with Collaboration of Experts | | 0
OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling | Code | 1
Token-Supervised Value Models for Enhancing Mathematical Reasoning Capabilities of Large Language Models | | 0
Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors | Code | 0
TelecomGPT: A Framework to Build Telecom-Specfic Large Language Models | | 0
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine | Code | 4
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist | | 0

No leaderboard results yet.