| MathPhys-Guided Coarse-to-Fine Anomaly Synthesis with SQE-Driven Bi-Level Optimization for Anomaly Detection | Apr 17, 2025 | Anomaly Detection, Data Augmentation | Unverified | 0 |
| Rethinking the Generation of High-Quality CoT Data from the Perspective of LLM-Adaptive Question Difficulty Grading | Apr 16, 2025 | 2k, Code Generation | Unverified | 0 |
| Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation | Apr 16, 2025 | GSM8K, Math | Unverified | 0 |
| Reinforcement Learning from Human Feedback | Apr 16, 2025 | Math, Philosophy | Code Available | 5 |
| ReTool: Reinforcement Learning for Strategic Tool Use in LLMs | Apr 15, 2025 | Math, Mathematical Reasoning | Unverified | 0 |
| Fine-Tuning Large Language Models on Quantum Optimization Problems for Circuit Generation | Apr 15, 2025 | Math, Quantum Machine Learning | Code Available | 1 |
| Heimdall: test-time scaling on the generative verification | Apr 14, 2025 | Math | Unverified | 0 |
| M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models | Apr 14, 2025 | Mamba, Math | Code Available | 1 |
| Efficient Process Reward Model Training via Active Learning | Apr 14, 2025 | Active Learning, Math | Code Available | 1 |
| The Jailbreak Tax: How Useful are Your Jailbreak Outputs? | Apr 14, 2025 | Math | Code Available | 1 |