| RevOrder: A Novel Method for Enhanced Arithmetic in Language Models | Feb 6, 2024 | GSM8K, Math | Unverified | 0 |
| Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision | Feb 5, 2024 | GSM8K, Math | Unverified | 0 |
| Improving Assessment of Tutoring Practices using Retrieval-Augmented Generation | Feb 4, 2024 | Hallucination, Math | Unverified | 0 |
| Salsa Fresca: Angular Embeddings and Pre-Training for ML Attacks on Learning With Errors | Feb 2, 2024 | Math | Unverified | 0 |
| Large Language Models for Mathematical Reasoning: Progresses and Challenges | Jan 31, 2024 | Diversity, Math | Unverified | 0 |
| Efficient Tool Use with Chain-of-Abstraction Reasoning | Jan 30, 2024 | Math, Mathematical Reasoning | Unverified | 0 |
| Taxonomy of Mathematical Plagiarism | Jan 30, 2024 | Math, Question Answering | Code Available | 0 |
| GAPS: Geometry-Aware Problem Solver | Jan 29, 2024 | Geometry Problem Solving, Math | Unverified | 0 |
| YODA: Teacher-Student Progressive Learning for Language Models | Jan 28, 2024 | GSM8K, Math | Unverified | 0 |
| Exploring Educational Equity: A Machine Learning Approach to Unravel Achievement Disparities in Georgia | Jan 25, 2024 | Math | Unverified | 0 |
| Using Java Geometry Expert as Guide in the Preparations for Math Contests | Jan 22, 2024 | Math | Unverified | 0 |
| Self-Imagine: Effective Unimodal Reasoning with Multimodal Models using Self-Imagination | Jan 16, 2024 | GSM8K, Language Modeling | Unverified | 0 |
| CHAMP: A Competition-level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities | Jan 13, 2024 | Math, Mathematical Reasoning | Unverified | 0 |
| Cramer-Rao bound and absolute sensitivity in chemical reaction networks | Jan 13, 2024 | Math, Sensitivity | Unverified | 0 |
| Using Large Language Models to Assess Tutors' Performance in Reacting to Students Making Math Errors | Jan 6, 2024 | Math | Unverified | 0 |
| Graph2Tac: Online Representation Learning of Formal Math Concepts | Jan 5, 2024 | AI Agent, Automated Theorem Proving | Unverified | 0 |
| Mastery Guided Non-parametric Clustering to Scale-up Strategy Prediction | Jan 4, 2024 | Clustering, Fairness | Unverified | 0 |
| Assessing the Impact of Prompting Methods on ChatGPT's Mathematical Capabilities | Dec 22, 2023 | Chatbot, GSM8K | Unverified | 0 |
| From Good to Great: Improving Math Reasoning with Tool-Augmented Interleaf Prompting | Dec 18, 2023 | Diversity, GSM8K | Unverified | 0 |
| TinyGSM: achieving >80% on GSM8k with small language models | Dec 14, 2023 | Arithmetic Reasoning, GSM8K | Unverified | 0 |
| Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning | Dec 14, 2023 | Arithmetic Reasoning, Few-Shot Learning | Unverified | 0 |
| Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models | Dec 11, 2023 | Diversity, Math | Unverified | 0 |
| LaRS: Latent Reasoning Skills for Chain-of-Thought Reasoning | Dec 7, 2023 | In-Context Learning, Math | Unverified | 0 |
| ChatGPT as a Math Questioner? Evaluating ChatGPT on Generating Pre-university Math Questions | Dec 4, 2023 | Arithmetic Reasoning, Math | Code Available | 0 |
| REDS: Resource-Efficient Deep Subnetworks for Dynamic Resource Constraints | Nov 22, 2023 | Computational Efficiency, Math | Unverified | 0 |
| First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning | Nov 14, 2023 | GSM8K, Math | Unverified | 0 |
| SAIE Framework: Support Alone Isn't Enough -- Advancing LLM Training with Adversarial Remarks | Nov 14, 2023 | GSM8K, Math | Unverified | 0 |
| VerityMath: Advancing Mathematical Reasoning by Self-Verification Through Unit Consistency | Nov 13, 2023 | Math, Mathematical Reasoning | Code Available | 0 |
| Large Language Models' Understanding of Math: Source Criticism and Extrapolation | Nov 12, 2023 | Automated Theorem Proving, Math | Unverified | 0 |
| Let's Reinforce Step by Step | Nov 10, 2023 | GSM8K, Logical Reasoning | Unverified | 0 |
| Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models | Nov 7, 2023 | Language Modelling, Large Language Model | Code Available | 0 |
| Enhancing LLM Intelligence with ARM-RAG: Auxiliary Rationale Memory for Retrieval Augmented Generation | Nov 7, 2023 | Math, RAG | Unverified | 0 |
| ATHENA: Mathematical Reasoning with Thought Expansion | Nov 2, 2023 | Math, Mathematical Reasoning | Code Available | 0 |
| Unleashing the Creative Mind: Language Model As Hierarchical Policy For Improved Exploration on Challenging Problem Solving | Nov 1, 2023 | In-Context Learning, Language Modeling | Code Available | 0 |
| Exploring the Reliability of Large Language Models as Customized Evaluators for Diverse NLP Tasks | Oct 30, 2023 | Fairness, Math | Code Available | 0 |
| math-PVS: A Large Language Model Framework to Map Scientific Publications to PVS Theories | Oct 25, 2023 | Automated Theorem Proving, Language Modeling | Unverified | 0 |
| We are Who We Cite: Bridges of Influence Between Natural Language Processing and Other Academic Fields | Oct 23, 2023 | Diversity, Math | Code Available | 0 |
| SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving | Oct 19, 2023 | GSM8K, Math | Code Available | 0 |
| Let's reward step by step: Step-Level reward model as the Navigators for Reasoning | Oct 16, 2023 | Code Generation, GSM8K | Unverified | 0 |
| Improving Large Language Model Fine-tuning for Solving Math Problems | Oct 16, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| Solving Math Word Problems with Reexamination | Oct 14, 2023 | Descriptive, Math | Code Available | 0 |
| The Search-and-Mix Paradigm in Approximate Nash Equilibrium Algorithms | Oct 12, 2023 | Math | Unverified | 0 |
| LLMs as Potential Brainstorming Partners for Math and Science Problems | Oct 10, 2023 | Math | Unverified | 0 |
| Guiding Language Model Reasoning with Planning Tokens | Oct 9, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| Critique Ability of Large Language Models | Oct 7, 2023 | Code Completion, Decision Making | Unverified | 0 |
| Resprompt: Residual Connection Prompting Advances Multi-Step Reasoning in Large Language Models | Oct 7, 2023 | Math | Unverified | 0 |
| Analysis of the Reasoning with Redundant Information Provided Ability of Large Language Models | Oct 6, 2023 | 8k, Math | Unverified | 0 |
| Concise and Organized Perception Facilitates Reasoning in Large Language Models | Oct 5, 2023 | LAMBADA, Math | Unverified | 0 |
| The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices | Oct 4, 2023 | Articles, Math | Code Available | 0 |
| Large Language Models as Analogical Reasoners | Oct 3, 2023 | Code Generation, GSM8K | Unverified | 0 |