| PP-DocBee: Improving Multimodal Document Understanding Through a Bag of Tricks | Mar 6, 2025 | document understanding, Language Modeling | Unverified | 0 |
| AOLO: Analysis and Optimization For Low-Carbon Oriented Wireless Large Language Model Services | Mar 6, 2025 | Deep Reinforcement Learning, Language Modeling | Unverified | 0 |
| Scaling Rich Style-Prompted Text-to-Speech Datasets | Mar 6, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| The Next Frontier of LLM Applications: Open Ecosystems and Hardware Synergy | Mar 6, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities | Mar 6, 2025 | Audio captioning, Language Modeling | Unverified | 0 |
| Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities | Mar 6, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Know Thy Judge: On the Robustness Meta-Evaluation of LLM Safety Judges | Mar 6, 2025 | Benchmarking, Language Modeling | Unverified | 0 |
| An Egocentric Vision-Language Model based Portable Real-time Smart Assistant | Mar 6, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| L^2M: Mutual Information Scaling Law for Long-Context Language Modeling | Mar 6, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| From Idea to CAD: A Language Model-Driven Multi-Agent System for Collaborative Design | Mar 6, 2025 | Language Modeling, Language Modelling | Unverified | 0 |