Math Word Problem Solving
A math word problem is a mathematical exercise (such as one found in a textbook, worksheet, or exam) in which significant background information is presented in ordinary language rather than in mathematical notation. Because most word problems involve a narrative of some sort, they are sometimes called story problems. The amount of technical language they use varies.
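As a small illustration, solving a word problem means translating its prose into arithmetic. The problem text and the function name `rolls_left` below are invented for this sketch and do not come from any benchmark.

```python
# Hypothetical word problem, invented for illustration:
# "A baker makes 12 trays of rolls with 8 rolls per tray and sells 70 rolls.
#  How many rolls are left?"

def rolls_left(trays: int, rolls_per_tray: int, sold: int) -> int:
    """Translate the narrative into arithmetic: total baked minus sold."""
    total_baked = trays * rolls_per_tray
    return total_baked - sold


print(rolls_left(trays=12, rolls_per_tray=8, sold=70))  # prints 26
```

The difficulty for a model usually lies less in the arithmetic than in working out which quantities and operations the narrative implies.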
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Gemini 2.0 Flash Experimental | Accuracy | 89.7 | — | Unverified |
| 2 | Qwen2.5-Math-72B-Instruct(TIR,Greedy) | Accuracy | 88.1 | — | Unverified |
| 3 | GPT-4 Turbo (MACM, w/code, voting) | Accuracy | 87.92 | — | Unverified |
| 4 | Qwen2.5-Math-72B-Instruct(COT,Greedy) | Accuracy | 85.9 | — | Unverified |
| 5 | Qwen2.5-Math-7B-Instruct(TIR,Greedy) | Accuracy | 85.2 | — | Unverified |
| 6 | GPT-4-code model (CSV, w/ code, SC, k=16) | Accuracy | 84.3 | — | Unverified |
| 7 | Qwen2-Math-72B-Instruct(greedy) | Accuracy | 84 | — | Unverified |
| 8 | Qwen2.5-Math-7B-Instruct(COT,Greedy) | Accuracy | 83.6 | — | Unverified |
| 9 | Qwen2.5-Math-1.5B-Instruct(TIR,Greedy) | Accuracy | 79.9 | — | Unverified |
| 10 | OpenMath2-Llama3.1-70B (majority@256) | Accuracy | 79.6 | — | Unverified |
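
Several entries in the table carry labels such as "voting", "SC, k=16", or "majority@256". The table does not define them. Assuming they refer to majority voting over several sampled answers per problem, that aggregation step might look like the minimal sketch below. The function name `majority_vote` and the sample answers are hypothetical.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> str:
    """Return the most frequent final answer among sampled solutions.

    Ties go to the answer that was seen first, because Counter.most_common
    keeps equal counts in first-encountered order.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    return counts.most_common(1)[0][0]


# Hypothetical final answers extracted from five sampled solutions.
samples = ["26", "24", "26", "26", "30"]
print(majority_vote(samples))  # prints 26
```

Entries marked "Greedy" presumably score a single deterministic answer per problem, so results obtained with different decoding strategies are not directly comparable.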