| Q-Agent: Quality-Driven Chain-of-Thought Image Restoration Agent through Robust Multimodal Large Language Model | Apr 9, 2025 | Image Quality Assessment, Image Restoration | Unverified | 0 |
| TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling | Apr 9, 2025 | Language Modeling | Code Available | 2 |
| Societal Impacts Research Requires Benchmarks for Creative Composition Tasks | Apr 9, 2025 | Language Modeling | Unverified | 0 |
| Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought | Apr 8, 2025 | Language Modeling | Code Available | 7 |
| InstructMPC: A Human-LLM-in-the-Loop Framework for Context-Aware Control | Apr 8, 2025 | Energy Management, Language Modeling | Unverified | 0 |
| Simplifying Data Integration: SLM-Driven Systems for Unified Semantic Queries Across Heterogeneous Databases | Apr 8, 2025 | Data Integration, Language Modeling | Unverified | 0 |
| DoCIA: An Online Document-Level Context Incorporation Agent for Speech Translation | Apr 7, 2025 | Automatic Speech Recognition (ASR) | Code Available | 0 |
| Evaluating Knowledge Graph Based Retrieval Augmented Generation Methods under Knowledge Incompleteness | Apr 7, 2025 | Knowledge Graphs, Language Modeling | Unverified | 0 |
| Towards Visual Text Grounding of Multimodal Large Language Model | Apr 7, 2025 | Benchmarking, Language Modeling | Unverified | 0 |
| Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration | Apr 7, 2025 | Language Modeling | Code Available | 1 |