| DIF: A Framework for Benchmarking and Verifying Implicit Bias in LLMs | May 15, 2025 | Benchmarking, Fairness | Unverified | 0 |
| Towards a Deeper Understanding of Reasoning Capabilities in Large Language Models | May 15, 2025 | Large Language Model, Math | Code Available | 0 |
| PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning | May 14, 2025 | Math, Mathematical Problem-Solving | Code Available | 0 |
| Accelerating Chain-of-Thought Reasoning: When Goal-Gradient Importance Meets Dynamic Skipping | May 13, 2025 | Domain Generalization, GSM8K | Unverified | 0 |
| Measurement to Meaning: A Validity-Centered Framework for AI Evaluation | May 13, 2025 | Math | Unverified | 0 |
| S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models | May 12, 2025 | GSM8K, Large Language Model | Unverified | 0 |
| Learning from Peers in Reasoning Models | May 12, 2025 | Math | Unverified | 0 |
| Multimodal Assessment of Classroom Discourse Quality: A Text-Centered Attention-Based Multi-Task Learning Approach | May 12, 2025 | Math, Multi-Task Learning | Unverified | 0 |
| DialogueReason: Rule-Based RL Sparks Dialogue Reasoning in LLMs | May 11, 2025 | Diversity, Math | Unverified | 0 |
| xGen-small Technical Report | May 10, 2025 | Decoder, Math | Unverified | 0 |