| NeuralNexus at BEA 2025 Shared Task: Retrieval-Augmented Prompting for Mistake Identification in AI Tutors | Jun 12, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Beyond Gold Standards: Epistemic Ensemble of LLM Judges for Formal Mathematical Reasoning | Jun 12, 2025 | Mathematical Reasoning | Unverified | 0 |
| Slimming Down LLMs Without Losing Their Minds | Jun 12, 2025 | Computational Efficiency, GSM8K | Unverified | 0 |
| RePO: Replay-Enhanced Policy Optimization | Jun 11, 2025 | Math, Mathematical Reasoning | Code Available | 1 |
| Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning | Jun 11, 2025 | Image Captioning, Math | Code Available | 2 |
| Large Language Models for Design Structure Matrix Optimization | Jun 11, 2025 | Combinatorial Optimization, Mathematical Reasoning | Unverified | 0 |
| Towards Efficient and Effective Alignment of Large Language Models | Jun 11, 2025 | Mathematical Reasoning, Meta-Learning | Unverified | 0 |
| CoRT: Code-integrated Reasoning within Thinking | Jun 11, 2025 | Mathematical Reasoning | Code Available | 2 |
| Omni-DPO: A Dual-Perspective Paradigm for Dynamic Preference Learning of LLMs | Jun 11, 2025 | Mathematical Reasoning | Code Available | 0 |
| Large Language Models Have Intrinsic Meta-Cognition, but Need a Good Lens | Jun 10, 2025 | Benchmarking, Mathematical Reasoning | Unverified | 0 |