GSM8K

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 376–400 of 439 papers

Title	Date	Tasks	Status
Prompt-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression	Mar 30, 2024	GSM8KRelation	—Unverified
Supervisory Prompt Training	Mar 26, 2024	GSM8KSentence	—Unverified
Self-Consistency Boosts Calibration for Math Reasoning	Mar 14, 2024	GSM8KMath	—Unverified
Prompt Selection and Augmentation for Few Examples Code Generation in Large Language Model and its Application in Robotics Control	Mar 11, 2024	Code GenerationDiversity	—Unverified
MathScale: Scaling Instruction Tuning for Mathematical Reasoning	Mar 5, 2024	GSM8KMath	CodeCode Available
Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning	Mar 4, 2024	GSM8KMath	—Unverified
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs	Feb 26, 2024	GSM8KMath	—Unverified
Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models	Feb 24, 2024	GSM8KMathematical Reasoning	—Unverified
Fine-Grained Self-Endorsement Improves Factuality and Reasoning	Feb 23, 2024	GSM8KLanguage Modeling	—Unverified
SymBa: Symbolic Backward Chaining for Structured Natural Language Reasoning	Feb 20, 2024	Arithmetic ReasoningGSM8K	—Unverified
Can Separators Improve Chain-of-Thought Prompting?	Feb 16, 2024	8kGSM8K	—Unverified
Orca-Math: Unlocking the potential of SLMs in Grade School Math	Feb 16, 2024	Arithmetic ReasoningGSM8K	—Unverified
Premise Order Matters in Reasoning with Large Language Models	Feb 14, 2024	GSM8KMathematical Problem-Solving	—Unverified
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements	Feb 13, 2024	GSM8KMath	—Unverified
The Unreasonable Effectiveness of Eccentric Automatic Prompts	Feb 9, 2024	Arithmetic ReasoningGSM8K	—Unverified
In-Context Principle Learning from Mistakes	Feb 8, 2024	GSM8KIn-Context Learning	CodeCode Available
RevOrder: A Novel Method for Enhanced Arithmetic in Language Models	Feb 6, 2024	GSM8KMath	—Unverified
Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision	Feb 5, 2024	GSM8KMath	—Unverified
YODA: Teacher-Student Progressive Learning for Language Models	Jan 28, 2024	GSM8KMath	—Unverified
Self-Imagine: Effective Unimodal Reasoning with Multimodal Models using Self-Imagination	Jan 16, 2024	GSM8KLanguage Modeling	—Unverified
Assessing the Impact of Prompting Methods on ChatGPT's Mathematical Capabilities	Dec 22, 2023	ChatbotGSM8K	—Unverified
From Good to Great: Improving Math Reasoning with Tool-Augmented Interleaf Prompting	Dec 18, 2023	DiversityGSM8K	—Unverified
TinyGSM: achieving >80% on GSM8k with small language models	Dec 14, 2023	Arithmetic ReasoningGSM8K	—Unverified
Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning	Dec 14, 2023	Arithmetic ReasoningFew-Shot Learning	—Unverified
Training Chain-of-Thought via Latent-Variable Inference	Nov 28, 2023	GSM8K	—Unverified

Show:10 25 50

← PrevPage 16 of 18Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Xolver	Accuracy	98.1	—	Unverified
2	Orange-mini	0-shot MRR	98	—	Unverified