| Pheromone-based Learning of Optimal Reasoning Paths | Jan 31, 2025 | ARC, GSM8K | —Unverified | 0 |
| PixelWorld: Towards Perceiving Everything as Pixels | Jan 31, 2025 | Math | —Unverified | 0 |
| Fairshare Data Pricing via Data Valuation for Large Language Models | Jan 31, 2025 | Data Valuation, Math | —Unverified | 0 |
| Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH | Jan 30, 2025 | Language Modeling, Language Modelling | —Unverified | 0 |
| Examining the Robustness of Large Language Models across Language Complexity | Jan 30, 2025 | Math | —Unverified | 0 |
| Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving | Jan 28, 2025 | Math, Mathematical Problem-Solving | —Unverified | 0 |
| Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework | Jan 26, 2025 | Math, Mathematical Reasoning | —Unverified | 0 |
| Clear Preferences Leave Traces: Reference Model-Guided Sampling for Preference Learning | Jan 25, 2025 | Math | —Unverified | 0 |
| DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students' Hand-Drawn Math Images | Jan 24, 2025 | Math | —Unverified | 0 |
| Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages | Jan 23, 2025 | Instruction Following, Math | —Unverified | 0 |