| RL-finetuning LLMs from on- and off-policy data with a single algorithm | Mar 25, 2025 | Mathematical Reasoning | —Unverified | 0 |
| Robustness Assessment of Mathematical Reasoning in the Presence of Missing and Contradictory Conditions | Jun 7, 2024 | Hallucination, Mathematical Reasoning | —Unverified | 0 |
| RV-Syn: Rational and Verifiable Mathematical Reasoning Data Synthesis based on Structured Function Library | Apr 29, 2025 | Data Augmentation, Mathematical Reasoning | —Unverified | 0 |
| S^3c-Math: Spontaneous Step-level Self-correction Makes Large Language Models Better Mathematical Reasoners | Sep 3, 2024 | GSM8K, Math | —Unverified | 0 |
| SAAS: Solving Ability Amplification Strategy for Enhanced Mathematical Reasoning in Large Language Models | Apr 5, 2024 | Mathematical Reasoning | —Unverified | 0 |
| Sail into the Headwind: Alignment via Robust Rewards and Dynamic Labels against Reward Hacking | Dec 12, 2024 | Mathematical Reasoning | —Unverified | 0 |
| Sample, Don't Search: Rethinking Test-Time Alignment for Language Models | Apr 4, 2025 | GSM8K, Mathematical Reasoning | —Unverified | 0 |
| Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search | Feb 4, 2025 | Mathematical Reasoning | —Unverified | 0 |
| SAT Solvers and Computer Algebra Systems: A Powerful Combination for Mathematics | Jul 9, 2019 | Mathematical Proofs, Mathematical Reasoning | —Unverified | 0 |
| SEED-GRPO: Semantic Entropy Enhanced GRPO for Uncertainty-Aware Policy Optimization | May 18, 2025 | Math, Mathematical Reasoning | —Unverified | 0 |