| Interleaved Reasoning for Large Language Models via Reinforcement Learning | May 26, 2025 | Logical Reasoning, Math | Unverified | 0 |
| Efficient Data Selection at Scale via Influence Distillation | May 25, 2025 | GSM8K, MMLU | Unverified | 0 |
| The Price of Format: Diversity Collapse in LLMs | May 25, 2025 | Diversity, GSM8K | Code Available | 0 |
| BnMMLU: Measuring Massive Multitask Language Understanding in Bengali | May 25, 2025 | General Knowledge, MMLU | Code Available | 0 |
| LoTA-QAF: Lossless Ternary Adaptation for Quantization-Aware Fine-Tuning | May 24, 2025 | Computational Efficiency, MMLU | Code Available | 0 |
| B-score: Detecting biases in large language models using response history | May 24, 2025 | MMLU | Unverified | 0 |
| INFERENCEDYNAMICS: Efficient Routing Across LLMs through Structured Capability and Knowledge Profiling | May 22, 2025 | Language Modeling | Unverified | 0 |
| Training Step-Level Reasoning Verifiers with Formal Verification Tools | May 21, 2025 | Formal Logic, Math | Code Available | 1 |
| Cost-aware LLM-based Online Dataset Annotation | May 21, 2025 | MMLU | Unverified | 0 |
| Dual Decomposition of Weights and Singular Value Low Rank Adaptation | May 20, 2025 | GSM8K, MMLU | Unverified | 0 |