SOTAVerified

GSM8K

Papers

Showing 401–439 of 439 papers

| Title | Status | Hype |
| --- | --- | --- |
| MathAttack: Attacking Large Language Models Towards Math Solving Ability | | 0 |
| No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function | | 0 |
| AskIt: Unified Programming Interface for Programming with Large Language Models | Code | 1 |
| Exploring Equation as a Better Intermediate Meaning Representation for Numerical Reasoning | Code | 0 |
| WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct | Code | 5 |
| Scaling Relationship on Learning Mathematical Reasoning with Large Language Models | Code | 2 |
| SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning | Code | 1 |
| A mixed policy to improve performance of language models on math problems | Code | 0 |
| DiversiGATE: A Comprehensive Framework for Reliable Large Language Models | | 0 |
| Interpretable Math Word Problem Solution Generation Via Step-by-step Planning | | 0 |
| Matrix Information Theory for Self-Supervised Learning | Code | 1 |
| Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models | Code | 3 |
| GRACE: Discriminator-Guided Chain-of-Thought Reasoning | Code | 1 |
| Calc-X and Calcformers: Empowering Arithmetical Chain-of-Thought through Interaction with Symbolic Systems | Code | 0 |
| Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement | Code | 1 |
| PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning | Code | 0 |
| Automatic Model Selection with Large Language Models for Reasoning | Code | 1 |
| RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought | | 0 |
| Hint of Thought prompting: an explainable and zero-shot approach to reasoning tasks with LLMs | | 0 |
| Self-Evaluation Guided Beam Search for Reasoning | | 0 |
| Progressive-Hint Prompting Improves Reasoning in Large Language Models | Code | 2 |
| Solving Math Word Problems by Combining Language Models With Symbolic Solvers | Code | 1 |
| Boosted Prompt Ensembles for Large Language Models | Code | 1 |
| Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning | Code | 1 |
| Teaching Small Language Models to Reason | | 0 |
| Distilling Reasoning Capabilities into Smaller Language Models | Code | 0 |
| Explicit Knowledge Transfer for Weakly-Supervised Code Generation | | 0 |
| Solving math word problems with process- and outcome-based feedback | | 0 |
| PAL: Program-aided Language Models | Code | 3 |
| Large Language Models Can Self-Improve | | 0 |
| Transcending Scaling Laws with 0.1% Extra Compute | | 0 |
| Language Models are Multilingual Chain-of-Thought Reasoners | Code | 2 |
| Complexity-Based Prompting for Multi-Step Reasoning | | 0 |
| Making Large Language Models Better Reasoners with Step-Aware Verifier | | 0 |
| Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions | Code | 1 |
| Large Language Models are Zero-Shot Reasoners | Code | 2 |
| Self-Consistency Improves Chain of Thought Reasoning in Language Models | Code | 1 |
| Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Code | 6 |
| Training Verifiers to Solve Math Word Problems | Code | 3 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Xolver | Accuracy | 98.1 | | Unverified |
| 2 | Orange-mini | 0-shot MRR | 98 | | Unverified |