| Regress, Don't Guess -- A Regression-like Loss on Number Tokens for Language Models | Nov 4, 2024 | Inductive BiasLanguage Modeling | CodeCode Available | 1 |
| Autoformalize Mathematical Statements by Symbolic Equivalence and Semantic Consistency | Oct 28, 2024 | Math | CodeCode Available | 1 |
| Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics | Oct 28, 2024 | Arithmetic ReasoningMath | CodeCode Available | 1 |
| Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes | Oct 22, 2024 | GSM8KLanguage Modeling | CodeCode Available | 1 |
| Non-myopic Generation of Language Models for Reasoning and Planning | Oct 22, 2024 | Computational EfficiencyLanguage Modelling | CodeCode Available | 1 |
| LoRA Soups: Merging LoRAs for Practical Skill Composition Tasks | Oct 16, 2024 | Mathparameter-efficient fine-tuning | CodeCode Available | 1 |
| CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning | Oct 14, 2024 | MathMathematical Reasoning | CodeCode Available | 1 |
| HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics | Oct 13, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| The Geometry of Concepts: Sparse Autoencoder Feature Structure | Oct 10, 2024 | Math | CodeCode Available | 1 |
| DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback | Oct 8, 2024 | MathSequential Decision Making | CodeCode Available | 1 |
| LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits | Oct 2, 2024 | Instruction FollowingMath | CodeCode Available | 1 |
| BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search | Sep 26, 2024 | MathMathematical Problem-Solving | CodeCode Available | 1 |
| To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning | Sep 18, 2024 | MathMMLU | CodeCode Available | 1 |
| MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning | Sep 18, 2024 | Math | CodeCode Available | 1 |
| Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement | Sep 17, 2024 | Active LearningDiversity | CodeCode Available | 1 |
| Explaining Datasets in Words: Statistical Models with Natural Language Parameters | Sep 13, 2024 | ClusteringLanguage Modeling | CodeCode Available | 1 |
| Sirius: Contextual Sparsity with Correction for Efficient LLMs | Sep 5, 2024 | Math | CodeCode Available | 1 |
| MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models | Aug 30, 2024 | Image CaptioningLanguage Modeling | CodeCode Available | 1 |
| What makes math problems hard for reinforcement learning: a case study | Aug 27, 2024 | MathReinforcement Learning (RL) | CodeCode Available | 1 |
| SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models | Aug 21, 2024 | 8kGSM8K | CodeCode Available | 1 |
| Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning | Aug 16, 2024 | MathMathematical Reasoning | CodeCode Available | 1 |
| Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization | Aug 14, 2024 | InformativenessInstruction Following | CodeCode Available | 1 |
| Mathfish: Evaluating Language Model Math Reasoning via Grounding in Educational Curricula | Aug 8, 2024 | GSM8KLanguage Modeling | CodeCode Available | 1 |
| On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents | Aug 2, 2024 | Code GenerationLarge Language Model | CodeCode Available | 1 |
| Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching | Jul 24, 2024 | Math | CodeCode Available | 1 |