Math

Papers

Showing 751–760 of 1596 papers

Title	Date	Tasks	Status	Hype
TurkishMMLU: Measuring Massive Multitask Language Understanding in Turkish	Jul 17, 2024	Math, Multiple-choice	Code Available	1
A LLM Benchmark based on the Minecraft Builder Dialog Agent Task	Jul 17, 2024	Math, Minecraft	Unverified	0
Reasoning with Large Language Models, a Survey	Jul 16, 2024	Few-Shot Learning, In-Context Learning	Unverified	0
CCoE: A Compact LLM with Collaboration of Experts	Jul 16, 2024	Language Modelling, Large Language Model	Unverified	0
OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling	Jul 13, 2024	Benchmarking, Math	Code Available	1
Token-Supervised Value Models for Enhancing Mathematical Reasoning Capabilities of Large Language Models	Jul 12, 2024	GSM8K, Math	Unverified	0
Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors	Jul 12, 2024	Language Modeling, Language Modelling	Code Available	0
TelecomGPT: A Framework to Build Telecom-Specfic Large Language Models	Jul 12, 2024	Code Generation, Math	Unverified	0
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine	Jul 11, 2024	Contrastive Learning, Language Modelling	Code Available	4
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist	Jul 11, 2024	GSM8K, Math	Unverified	0

No leaderboard results yet.