| Matrix Information Theory for Self-Supervised Learning | May 27, 2023 | Contrastive Learning, GSM8K | Code Available | 1 | 5 |
| FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving | Feb 27, 2025 | GSM8K, Math | Code Available | 1 | 5 |
| Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates | Feb 28, 2024 | GSM8K, Safety Alignment | Code Available | 1 | 5 |
| Learning Goal-Conditioned Representations for Language Reward Models | Jul 18, 2024 | GSM8K, Math | Code Available | 1 | 5 |
| SMART: Self-Aware Agent for Tool Overuse Mitigation | Feb 17, 2025 | GSM8K, Large Language Model | Code Available | 1 | 5 |
| LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback | Jun 20, 2024 | Binary Classification, GSM8K | Code Available | 1 | 5 |
| Re-Initialization Token Learning for Tool-Augmented Large Language Models | Jun 17, 2025 | GSM8K, Question Answering | Code Available | 0 | 5 |
| Fill in the Blank: Exploring and Enhancing LLM Capabilities for Backward Reasoning in Math Word Problems | Oct 3, 2023 | GSM8K, Math | Code Available | 0 | 5 |
| COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement | Oct 12, 2024 | Code Generation, Computational Efficiency | Code Available | 0 | 5 |
| Reasoning Under 1 Billion: Memory-Augmented Reinforcement Learning for Large Language Models | Apr 3, 2025 | GSM8K, Reinforcement Learning (RL) | Code Available | 0 | 5 |