| VocalAgent: Large Language Models for Vocal Health Diagnostics with Safety-Aware Evaluation | May 19, 2025 | Diagnostic, Language Modeling | Unverified | 0 |
| Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space | May 19, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping | May 19, 2025 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0 |
| Temporal-Oriented Recipe for Transferring Large Vision-Language Model to Video Understanding | May 19, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| The Traitors: Deception and Trust in Multi-Agent Language Model Simulations | May 19, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| On the Thinking-Language Modeling Gap in Large Language Models | May 19, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| R1dacted: Investigating Local Censorship in DeepSeek's R1 Language Model | May 19, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| TinyAlign: Boosting Lightweight Vision-Language Models by Mitigating Modal Alignment Bottlenecks | May 19, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation | May 19, 2025 | Knowledge Distillation, Language Modeling | Unverified | 0 |
| ORQA: A Benchmark and Foundation Model for Holistic Operating Room Modeling | May 19, 2025 | Graph Generation, Knowledge Distillation | Unverified | 0 |
| SurveillanceVQA-589K: A Benchmark for Comprehensive Surveillance Video-Language Understanding with Large Models | May 19, 2025 | Causal Inference, Decision Making | Unverified | 0 |
| VLC Fusion: Vision-Language Conditioned Sensor Fusion for Robust Object Detection | May 19, 2025 | Autonomous Driving, Language Modeling | Unverified | 0 |
| IDEAL: Data Equilibrium Adaptation for Multi-Capability Language Model Alignment | May 19, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| SpatialLLM: From Multi-modality Data to Urban Spatial Intelligence | May 19, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| 3D Visual Illusion Depth Estimation | May 19, 2025 | Common Sense Reasoning, Depth Estimation | Code Available | 1 |
| G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning | May 19, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO | May 19, 2025 | Decoder, Image Generation | Code Available | 0 |
| Tianyi: A Traditional Chinese Medicine all-rounder language model and its Real-World Clinical Practice | May 19, 2025 | All, Hallucination | Unverified | 0 |
| Structure-Aware Corpus Construction and User-Perception-Aligned Metrics for Large-Language-Model Code Completion | May 19, 2025 | Code Completion, Language Modeling | Unverified | 0 |
| R3: Robust Rubric-Agnostic Reward Models | May 19, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| CIE: Controlling Language Model Text Generations Using Continuous Signals | May 19, 2025 | Continuous Control | Code Available | 0 |
| A Physics-Inspired Optimizer: Velocity Regularized Adam | May 19, 2025 | Image Classification | Unverified | 0 |
| CMLFormer: A Dual Decoder Transformer with Switching Point Learning for Code-Mixed Language Modeling | May 19, 2025 | Decoder, Language Modeling | Unverified | 0 |
| LLM-Based User Simulation for Low-Knowledge Shilling Attacks on Recommender Systems | May 18, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| DS-ProGen: A Dual-Structure Deep Language Model for Functional Protein Design | May 18, 2025 | Language Modeling, Language Modelling | Unverified | 0 |