| Reasoning Robustness of LLMs to Adversarial Typographical Errors | Nov 8, 2024 | GSM8K, MMLU | Unverified | 0 |
| Kwai-STaR: Transform LLMs into State-Transition Reasoners | Nov 7, 2024 | GSM8K, Mathematical Problem-Solving | Unverified | 0 |
| Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding | Nov 6, 2024 | ARC, GSM8K | Code Available | 2 |
| Self-Consistency Preference Optimization | Nov 6, 2024 | GSM8K, Math | Unverified | 0 |
| Dictionary Insertion Prompting for Multilingual Reasoning on Multilingual Large Language Models | Nov 2, 2024 | GSM8K, Math | Unverified | 0 |
| Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation | Oct 27, 2024 | GSM8K, Language Modeling | Unverified | 0 |
| LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization | Oct 27, 2024 | GSM8K, HellaSwag | Code Available | 1 |
| ReasonAgain: Using Extractable Symbolic Programs to Evaluate Mathematical Reasoning | Oct 24, 2024 | GSM8K, Math | Unverified | 0 |
| Scaling up Masked Diffusion Models on Text | Oct 24, 2024 | GSM8K, Language Modeling | Code Available | 3 |
| Adaptive Dense Reward: Understanding the Gap Between Action and Reward Space in Alignment | Oct 23, 2024 | GSM8K, HumanEval | Unverified | 0 |
| Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes | Oct 22, 2024 | GSM8K, Language Modeling | Code Available | 1 |
| Optimizing Chain-of-Thought Reasoning: Tackling Arranging Bottleneck via Plan Augmentation | Oct 22, 2024 | GSM8K, Math | Unverified | 0 |
| SMART: Self-learning Meta-strategy Agent for Reasoning Tasks | Oct 21, 2024 | GSM8K, Self-Learning | Code Available | 0 |
| On Designing Effective RL Reward at Training Time for LLM Reasoning | Oct 19, 2024 | GSM8K, Math | Unverified | 0 |
| TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling | Oct 18, 2024 | Computational Efficiency, GSM8K | Unverified | 0 |
| SBI-RAG: Enhancing Math Word Problem Solving for Students through Schema-Based Instruction and Retrieval-Augmented Generation | Oct 17, 2024 | GSM8K, Language Modeling | Code Available | 0 |
| Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning | Oct 16, 2024 | All, GSM8K | Code Available | 0 |
| MIND: Math Informed syNthetic Dialogues for Pretraining LLMs | Oct 15, 2024 | GSM8K, Math | Unverified | 0 |
| One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks | Oct 14, 2024 | Fairness, GSM8K | Code Available | 0 |
| How to Leverage Demonstration Data in Alignment for Large Language Model? A Self-Imitation Learning Perspective | Oct 14, 2024 | Density Ratio Estimation, GSM8K | Code Available | 0 |
| COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement | Oct 12, 2024 | Code Generation, Computational Efficiency | Code Available | 0 |
| Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization | Oct 11, 2024 | GSM8K, Language Modeling | Code Available | 2 |
| Nudging: Inference-time Alignment of LLMs via Guided Decoding | Oct 11, 2024 | General Knowledge, GSM8K | Unverified | 0 |
| SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights | Oct 11, 2024 | GSM8K, Math | Code Available | 4 |
| Towards Multilingual LLM Evaluation for European Languages | Oct 11, 2024 | ARC, GSM8K | Unverified | 0 |