| WHISTRESS: Enriching Transcriptions with Sentence Stress Detection | May 25, 2025 | Sentence, Zero-shot Generalization | Unverified | 0 |
| ASPO: Adaptive Sentence-Level Preference Optimization for Fine-Grained Multimodal Reasoning | May 25, 2025 | Computational Efficiency, Multimodal Reasoning | Unverified | 0 |
| Building a Functional Machine Translation Corpus for Kpelle | May 24, 2025 | Data Augmentation, Language Modelling | Unverified | 0 |
| MedScore: Factuality Evaluation of Free-Form Medical Answers | May 24, 2025 | Form, Hallucination | Code Available | 0 |
| PD^3: A Project Duplication Detection Framework via Adapted Multi-Agent Debate | May 23, 2025 | Sentence | Unverified | 0 |
| Multi-Scale Probabilistic Generation Theory: A Hierarchical Framework for Interpreting Large Language Models | May 23, 2025 | Sentence | Unverified | 0 |
| Memorization or Reasoning? Exploring the Idiom Understanding of LLMs | May 22, 2025 | Machine Translation, Memorization | Unverified | 0 |
| A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP | May 22, 2025 | Continual Pretraining, Diagnostic | Code Available | 0 |
| SafeKey: Amplifying Aha-Moment Insights for Safety Reasoning | May 22, 2025 | Sentence | Unverified | 0 |
| LLMs Are Not Scorers: Rethinking MT Evaluation with Generation-Based Methods | May 22, 2025 | Decoder, Machine Translation | Code Available | 0 |