| Using Large Language Model to Solve and Explain Physics Word Problems Approaching Human Level | Sep 15, 2023 | Few-Shot Learning, High School Physics | Unverified | 0 |
| MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Sep 11, 2023 | Math, Mathematical Reasoning | Code Available | 2 |
| GPT Can Solve Mathematical Problems Without a Calculator | Sep 6, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| MathAttack: Attacking Large Language Models Towards Math Solving Ability | Sep 4, 2023 | Adversarial Attack, GSM8K | Unverified | 0 |
| Solving Math Word Problem with Problem Type Classification | Aug 26, 2023 | Answer Selection, Classification | Code Available | 0 |
| WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct | Aug 18, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 5 |
| GraphReason: Enhancing Reasoning Capabilities of Large Language Models through A Graph-Based Verification Approach | Aug 18, 2023 | Math | Unverified | 0 |
| Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification | Aug 15, 2023 | Arithmetic Reasoning, Math | Code Available | 2 |
| Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems | Aug 10, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| Towards an AI to Win Ghana's National Science and Maths Quiz | Aug 8, 2023 | Math, Question Answering | Code Available | 1 |
| NEOLAF, an LLM-powered neural-symbolic cognitive architecture | Aug 8, 2023 | Incremental Learning, Math | Unverified | 0 |
| Cumulative Reasoning with Large Language Models | Aug 8, 2023 | Decision Making, Logical Reasoning | Code Available | 2 |
| Scalable and Equitable Math Problem Solving Strategy Prediction in Big Educational Data | Aug 7, 2023 | Math, Misconceptions | Code Available | 0 |
| Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning | Aug 7, 2023 | In-Context Learning, Math | Code Available | 0 |
| Studying Large Language Model Generalization with Influence Functions | Aug 7, 2023 | counterfactual, Language Modeling | Code Available | 1 |
| A Symbolic Character-Aware Model for Solving Geometry Problems | Aug 5, 2023 | Math, Multi-Label Classification | Code Available | 1 |
| MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities | Aug 4, 2023 | Math, MM-Vet | Code Available | 2 |
| Reasoning in Large Language Models Through Symbolic Math Word Problems | Aug 3, 2023 | Math | Code Available | 0 |
| Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models | Aug 1, 2023 | In-Context Learning, Math | Unverified | 0 |
| SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning | Aug 1, 2023 | GSM8K, Math | Code Available | 1 |
| Augmented Math: Authoring AR-Based Explorable Explanations by Augmenting Static Math Textbooks | Jul 30, 2023 | Math, Optical Character Recognition | Code Available | 0 |
| A large language model-assisted education tool to provide feedback on open-ended responses | Jul 25, 2023 | Language Modeling, Language Modelling | Code Available | 0 |
| ARB: Advanced Reasoning Benchmark for Large Language Models | Jul 25, 2023 | Math | Unverified | 0 |
| Explaining Math Word Problem Solvers | Jul 24, 2023 | Math | Unverified | 0 |
| Controlling Equational Reasoning in Large Language Models with Prompt Interventions | Jul 19, 2023 | Hallucination, In-Context Learning | Unverified | 0 |