| Improving Noise Robustness of LLM-based Zero-shot TTS via Discrete Acoustic Token Denoising | May 20, 2025 | Decoder, Denoising | Unverified | 0 |
| A*-Decoding: Token-Efficient Inference Scaling | May 19, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Krikri: Advancing Open Large Language Models for Greek | May 19, 2025 | Code Generation, Language Modeling | Unverified | 0 |
| Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation | May 19, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| ReSW-VL: Representation Learning for Surgical Workflow Analysis Using Vision-Language Model | May 19, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping | May 19, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0 |
| Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space | May 19, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| VocalAgent: Large Language Models for Vocal Health Diagnostics with Safety-Aware Evaluation | May 19, 2025 | Diagnostic, Language Modeling | Unverified | 0 |
| G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning | May 19, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| SurveillanceVQA-589K: A Benchmark for Comprehensive Surveillance Video-Language Understanding with Large Models | May 19, 2025 | Causal Inference, Decision Making | Unverified | 0 |