SOTAVerified

GSM8K

Papers

Showing 81-90 of 439 papers

Title | Status | Hype
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models | Code | 2
Language Models as Science Tutors | Code | 1
Large Language Models are Contrastive Reasoners | Code | 1
Multiple-Choice Questions are Efficient and Robust LLM Evaluators | Code | 1
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning | Code | 1
Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates | Code | 1
Automatic Model Selection with Large Language Models for Reasoning | Code | 1
Matrix Information Theory for Self-Supervised Learning | Code | 1
CommVQ: Commutative Vector Quantization for KV Cache Compression | Code | 1
Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 | | Unverified
2 | Orange-mini | 0-shot MRR | 98 | | Unverified