| Towards Efficient and Effective Alignment of Large Language Models | Jun 11, 2025 | Mathematical Reasoning, Meta-Learning | Unverified | 0 |
| Large Language Models for Design Structure Matrix Optimization | Jun 11, 2025 | Combinatorial Optimization, Mathematical Reasoning | Unverified | 0 |
| Large Language Models Have Intrinsic Meta-Cognition, but Need a Good Lens | Jun 10, 2025 | Benchmarking, Mathematical Reasoning | Unverified | 0 |
| A Survey on Large Language Models for Mathematical Reasoning | Jun 10, 2025 | Answer Generation, Mathematical Reasoning | Unverified | 0 |
| Can A Gamer Train A Mathematical Reasoning Model? | Jun 10, 2025 | GPU, Mathematical Reasoning | Code Available | 0 |
| VReST: Enhancing Reasoning in Large Vision-Language Models through Tree Search and Self-Reward Mechanism | Jun 10, 2025 | Mathematical Reasoning, Visual Reasoning | Code Available | 0 |
| Temporalizing Confidence: Evaluation of Chain-of-Thought Reasoning with Signal Temporal Logic | Jun 9, 2025 | Mathematical Reasoning | Unverified | 0 |
| Can Theoretical Physics Research Benefit from Language Agents? | Jun 6, 2025 | Code Generation, Mathematical Reasoning | Unverified | 0 |
| LogicPuzzleRL: Cultivating Robust Mathematical Reasoning in LLMs via Reinforcement Learning | Jun 5, 2025 | Mathematical Reasoning, reinforcement-learning | Code Available | 0 |
| Multi-Layer GRPO: Enhancing Reasoning and Self-Correction in Large Language Models | Jun 5, 2025 | Mathematical Reasoning | Unverified | 0 |
| Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning | Jun 5, 2025 | Mathematical Reasoning, Problem Decomposition | Unverified | 0 |
| ProRefine: Inference-time Prompt Refinement with Textual Feedback | Jun 5, 2025 | Mathematical Reasoning | Unverified | 0 |
| Mathematical Reasoning for Unmanned Aerial Vehicles: A RAG-Based Approach for Complex Arithmetic Reasoning | Jun 5, 2025 | Arithmetic Reasoning, Math | Code Available | 0 |
| VideoMathQA: Benchmarking Mathematical Reasoning via Multimodal Understanding in Videos | Jun 5, 2025 | Benchmarking, Mathematical Reasoning | Unverified | 0 |
| Revisiting Test-Time Scaling: A Survey and a Diversity-Aware Method for Efficient Reasoning | Jun 5, 2025 | Diversity, Mathematical Reasoning | Unverified | 0 |
| Adaptive Graph Pruning for Multi-Agent Communication | Jun 3, 2025 | Code Generation, Large Language Model | Code Available | 0 |
| WebChoreArena: Evaluating Web Browsing Agents on Realistic Tedious Web Tasks | Jun 2, 2025 | Large Language Model, Mathematical Reasoning | Unverified | 0 |
| Uni-LoRA: One Vector is All You Need | Jun 1, 2025 | All, Mathematical Reasoning | Unverified | 0 |
| GThinker: Towards General Multimodal Reasoning via Cue-Guided Rethinking | Jun 1, 2025 | 4k, Math | Code Available | 0 |
| Speculative Reward Model Boosts Decision Making Ability of LLMs Cost-Effectively | May 31, 2025 | Decision Making, Mathematical Reasoning | Code Available | 0 |
| Evaluation of LLMs for mathematical problem solving | May 30, 2025 | GSM8K, Mathematical Problem-Solving | Unverified | 0 |
| RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation | May 30, 2025 | Code Generation, Diversity | Code Available | 0 |
| On-Policy RL with Optimal Reward Baseline | May 29, 2025 | Large Language Model, Mathematical Reasoning | Unverified | 0 |
| Scaling up the think-aloud method | May 29, 2025 | Mathematical Reasoning | Code Available | 0 |
| Probability-Consistent Preference Optimization for Enhanced LLM Reasoning | May 29, 2025 | Mathematical Reasoning | Code Available | 0 |