SOTAVerified

Math Word Problem Solving

A math word problem is a mathematical exercise (such as in a textbook, worksheet, or exam) where significant background information on the problem is presented in ordinary language rather than in mathematical notation. As most word problems involve a narrative of some sort, they are sometimes referred to as story problems and may vary in the amount of technical language used.

Papers

Showing 51-100 of 107 papers

Title | Status | Hype
RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning | Code | 1
Automatic Model Selection with Large Language Models for Reasoning | Code | 1
Let GPT be a Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation | | 0
Progressive-Hint Prompting Improves Reasoning in Large Language Models | Code | 2
Sparks of Artificial General Intelligence: Early experiments with GPT-4 | Code | 6
LLaMA: Open and Efficient Foundation Language Models | Code | 7
Automatic Generation of Socratic Subquestions for Teaching Math Word Problems | Code | 1
PAL: Program-aided Language Models | Code | 3
Galactica: A Large Language Model for Science | Code | 4
Multi-View Reasoning: Consistent Contrastive Learning for Math Word Problem | Code | 2
ELASTIC: Numerical Reasoning with Adaptive Symbolic Compiler | Code | 1
Improving Compositional Generalization in Math Word Problem Solving | Code | 0
Solving Quantitative Reasoning Problems with Language Models | Code | 2
Large Language Models are Zero-Shot Reasoners | Code | 2
LogicSolver: Towards Interpretable Math Word Problem Solving with Logical Prompt-enhanced Learning | Code | 0
EPT-X: An Expression-Pointer Transformer model that generates eXplanations for numbers | Code | 0
Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction | Code | 1
MWP-BERT: Numeracy-Augmented Pre-training for Math Word Problem Solving | | 0
Towards Interpretable Math Word Problem Solving with Grounded Linguistic Logic Reasoning | | 0
An Edge-Enhanced Hierarchical Graph-to-Tree Network for Math Word Problem Solving | Code | 0
IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning | Code | 1
Recall and Learn: A Memory-augmented Solver for Math Word Problems | Code | 1
Adversarial Examples for Evaluating Math Word Problem Solvers | Code | 0
Generate & Rank: A Multi-task Framework for Math Word Problems | | 0
MWPToolkit: An Open-Source Framework for Deep Learning-Based Math Word Problem Solvers | Code | 1
Math Word Problem Solving with Explicit Numerical Values | Code | 1
MWP-BERT: Numeracy-Augmented Pre-training for Math Word Problem Solving | Code | 1
Are NLP Models really able to Solve Simple Math Word Problems? | Code | 1
Measuring Mathematical Problem Solving With the MATH Dataset | Code | 2
Generating Equation by Utilizing Operators : GEO model | | 0
A Knowledge-Aware Sequence-to-Tree Network for Math Word Problem Solving | | 0
Point to the Expression: Solving Algebraic Word Problems using the Expression-Pointer Transformer Model | Code | 0
Semantically-Aligned Universal Tree-Structured Solver for Math Word Problems | Code | 1
Reverse Operation based Data Augmentation for Solving Math Word Problems | Code | 0
Ape210K: A Large-Scale and Template-Rich Dataset of Math Word Problems | Code | 1
A Chinese Math Word Problem Solving System Based on Linguistic Theory and Non-statistical Approach | | 0
Graph-to-Tree Learning for Solving Math Word Problems | Code | 1
DeBERTa: Decoding-enhanced BERT with Disentangled Attention | Code | 2
Graph-to-Tree Neural Networks for Learning Structured Input-Output Translation with Applications to Semantic Parsing and Math Word Problem | Code | 1
A Goal-Driven Tree-Structured Neural Model for Math Word Problems | Code | 0
Modeling Intra-Relation in Math Word Problems with Different Functional Multi-Head Attentions | Code | 0
MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms | | 0
Analysing Mathematical Reasoning Abilities of Neural Models | Code | 0
An Improved Coarse-to-Fine Method for Solving Generation Tasks | | 0
Translating a Math Word Problem to an Expression Tree | Code | 0
Semantically-Aligned Equation Generation for Solving and Reasoning Math Word Problems | Code | 0
Translating a Math Word Problem to a Expression Tree | | 0
Neural Math Word Problem Solver with Reinforcement Learning | | 0
Using Intermediate Representations to Solve Math Word Problems | | 0
Deep Neural Solver for Math Word Problems | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Gemini 2.0 Flash Experimental | Accuracy | 89.7 | | Unverified
2 | Qwen2.5-Math-72B-Instruct(TIR,Greedy) | Accuracy | 88.1 | | Unverified
3 | GPT-4 Turbo (MACM, w/code, voting) | Accuracy | 87.92 | | Unverified
4 | Qwen2.5-Math-72B-Instruct(COT,Greedy) | Accuracy | 85.9 | | Unverified
5 | Qwen2.5-Math-7B-Instruct(TIR,Greedy) | Accuracy | 85.2 | | Unverified
6 | GPT-4-code model (CSV, w/ code, SC, k=16) | Accuracy | 84.3 | | Unverified
7 | Qwen2-Math-72B-Instruct(greedy) | Accuracy | 84 | | Unverified
8 | Qwen2.5-Math-7B-Instruct(COT,Greedy) | Accuracy | 83.6 | | Unverified
9 | Qwen2.5-Math-1.5B-Instruct(TIR,Greedy) | Accuracy | 79.9 | | Unverified
10 | OpenMath2-Llama3.1-70B (majority@256) | Accuracy | 79.6 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 DUP | Accuracy | 94.2 | | Unverified
2 | GPT-4 (Teaching-Inspired) | Execution Accuracy | 93.9 | | Unverified
3 | GPT-4 (Model Selection) | Execution Accuracy | 93.7 | | Unverified
4 | Qwen2(CoT + Code Interpreter) | Execution Accuracy | 92.3 | | Unverified
5 | GPT-4 (PHP) | Execution Accuracy | 91.9 | | Unverified
6 | OpenMath-CodeLlama-70B (w/ code) | Execution Accuracy | 87.8 | | Unverified
7 | MathCoder-L-70B | Execution Accuracy | 84.9 | | Unverified
8 | PoT_Eng (self-consistency @ 5) | Execution Accuracy | 83.7 | | Unverified
9 | CoT_Eng (self-consistency @ 5) | Execution Accuracy | 82.5 | | Unverified
10 | MMOS-CODE-34B(0-shot) | Execution Accuracy | 80.6 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OpenMath-CodeLlama-70B (w/ code) | Accuracy (%) | 95.7 | | Unverified
2 | MsAT-DeductReasoner | Accuracy (%) | 94.3 | | Unverified
3 | ATHENA (roberta-large) | Accuracy (%) | 93 | | Unverified
4 | Exp-Tree | Accuracy (%) | 92.3 | | Unverified
5 | Multi-view | Accuracy (%) | 92.3 | | Unverified
6 | ATHENA (roberta-base) | Accuracy (%) | 92.2 | | Unverified
7 | Roberta-DeductReasoner | Accuracy (%) | 92 | | Unverified
8 | DeBERTa (PM + VM) | Accuracy (%) | 91 | | Unverified
9 | EPT | Accuracy (%) | 88.7 | | Unverified
10 | Graph2Tree with RoBERTa | Accuracy (%) | 88.7 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 (Teaching-Inspired) | Accuracy (5-fold) | 94.3 | | Unverified
2 | ATHENA (roberta-large) | Accuracy (training-test) | 86.5 | | Unverified
3 | Multi-view* (ours) | Accuracy (5-fold) | 85.2 | | Unverified
4 | ATHENA (roberta-base) | Accuracy (training-test) | 84.4 | | Unverified
5 | Generate and Rank | Accuracy (5-fold) | 84.3 | | Unverified
6 | Exp-Tree | Accuracy (5-fold) | 84.1 | | Unverified
7 | REAL2: Memory-augmented Solver | Accuracy (5-fold) | 83.18 | | Unverified
8 | Roberta-DeductReasoner | Accuracy (5-fold) | 83 | | Unverified
9 | MWP-BERT | Accuracy (5-fold) | 82.4 | | Unverified
10 | Recall and Learn | Accuracy (5-fold) | 80.8 | | Unverified