| Pheromone-based Learning of Optimal Reasoning Paths | Jan 31, 2025 | ARC, GSM8K | —Unverified | 0 |
| Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Boostrapping | Jan 31, 2025 | Denoising, Image Denoising | Code Available | 0 |
| PixelWorld: Towards Perceiving Everything as Pixels | Jan 31, 2025 | Math | —Unverified | 0 |
| Examining the Robustness of Large Language Models across Language Complexity | Jan 30, 2025 | Math | —Unverified | 0 |
| Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis | Jan 30, 2025 | Automated Theorem Proving, Math | Code Available | 1 |
| Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH | Jan 30, 2025 | Language Modeling, Language Modelling | —Unverified | 0 |
| Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate | Jan 29, 2025 | Instruction Following, Math | Code Available | 2 |
| Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving | Jan 28, 2025 | Math, Mathematical Problem-Solving | —Unverified | 0 |
| Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework | Jan 26, 2025 | Math, Mathematical Reasoning | —Unverified | 0 |
| Clear Preferences Leave Traces: Reference Model-Guided Sampling for Preference Learning | Jan 25, 2025 | Math | —Unverified | 0 |