| Title | Date | Tags | Code | Count |
| --- | --- | --- | --- | --- |
| Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation | Apr 16, 2025 | GSM8K, Math | Unverified | 0 |
| Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution | Apr 13, 2025 | GSM8K, Math | Code Available | 3 |
| Question Tokens Deserve More Attention: Enhancing Large Language Models without Training through Step-by-Step Reading and Question Attention Recalibration | Apr 13, 2025 | GSM8K | Unverified | 0 |
| Supervised Optimism Correction: Be Confident When LLMs Are Sure | Apr 10, 2025 | GSM8K, Math | Unverified | 0 |
| Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use | Apr 7, 2025 | GSM8K, Math | Unverified | 0 |
| SEAL: Steerable Reasoning Calibration of Large Language Models for Free | Apr 7, 2025 | GSM8K | Code Available | 2 |
| Sample, Don't Search: Rethinking Test-Time Alignment for Language Models | Apr 4, 2025 | GSM8K, Mathematical Reasoning | Unverified | 0 |
| Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency | Apr 4, 2025 | Benchmarking, GSM8K | Unverified | 0 |
| Large (Vision) Language Models are Unsupervised In-Context Learners | Apr 3, 2025 | GSM8K, In-Context Learning | Code Available | 1 |
| Reasoning Under 1 Billion: Memory-Augmented Reinforcement Learning for Large Language Models | Apr 3, 2025 | GSM8K, Reinforcement Learning (RL) | Code Available | 0 |