| Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models | Mar 27, 2025 | Data Visualization, Math | Code Available | 0 |
| Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad | Mar 27, 2025 | Math, Mathematical Reasoning | Unverified | 0 |
| Entropy-Aware Branching for Improved Mathematical Reasoning | Mar 27, 2025 | Mathematical Reasoning | Unverified | 0 |
| R-PRM: Reasoning-Driven Process Reward Modeling | Mar 27, 2025 | Mathematical Reasoning | Code Available | 1 |
| SWI: Speaking with Intent in Large Language Models | Mar 27, 2025 | Mathematical Reasoning, Question Answering | Code Available | 0 |
| Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks | Mar 27, 2025 | Imitation Learning, Mathematical Reasoning | Code Available | 2 |
| MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams | Mar 26, 2025 | Mathematical Reasoning, Object Counting | Unverified | 0 |
| Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence | Mar 26, 2025 | Mathematical Reasoning | Code Available | 0 |
| Learning to chain-of-thought with Jensen's evidence lower bound | Mar 25, 2025 | Mathematical Reasoning, reinforcement-learning | Unverified | 0 |
| Process or Result? Manipulated Ending Tokens Can Mislead Reasoning LLMs to Ignore the Correct Reasoning Steps | Mar 25, 2025 | Mathematical Reasoning | Unverified | 0 |