| From Good to Great: Improving Math Reasoning with Tool-Augmented Interleaf Prompting | Dec 18, 2023 | Diversity, GSM8K | — Unverified | 0 |
| An In-depth Look at Gemini's Language Abilities | Dec 18, 2023 | Instruction Following, Math | Code Available | 1 |
| Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning | Dec 14, 2023 | Arithmetic Reasoning, Few-Shot Learning | — Unverified | 0 |
| Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent | Dec 14, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| TinyGSM: achieving >80% on GSM8k with small language models | Dec 14, 2023 | Arithmetic Reasoning, GSM8K | — Unverified | 0 |
| Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations | Dec 14, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 1 |
| Assessing GPT4-V on Structured Reasoning Tasks | Dec 13, 2023 | Code Generation, Language Modeling | — Unverified | 0 |
| Frugal LMs Trained to Invoke Symbolic Solvers Achieve Parameter-Efficient Arithmetic Reasoning | Dec 9, 2023 | Arithmetic Reasoning, Mathematical Reasoning | Code Available | 0 |
| Universal Self-Consistency for Large Language Model Generation | Nov 29, 2023 | Code Generation, Language Modeling | — Unverified | 0 |
| LANS: A Layout-Aware Neural Solver for Plane Geometry Problem | Nov 25, 2023 | Geometry Problem Solving, Language Modelling | — Unverified | 0 |