| UniCode^2: Cascaded Large-scale Codebooks for Unified Multimodal Understanding and Generation | Jun 25, 2025 | 16k | Unverified | 0 |
| MSTAR: Box-free Multi-query Scene Text Retrieval with Attention Recycling | Jun 12, 2025 | 16k, Retrieval | Code Available | 0 |
| How Far Are We from Optimal Reasoning Efficiency? | Jun 8, 2025 | 16k, Benchmarking | Code Available | 0 |
| FlashDMoE: Fast Distributed MoE in a Single Kernel | Jun 5, 2025 | 16k, CPU | Code Available | 3 |
| FAMA: The First Large-Scale Open-Science Speech Foundation Model for English and Italian | May 28, 2025 | 16k | Code Available | 0 |
| UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents | May 27, 2025 | 16k | Code Available | 2 |
| SpecExtend: A Drop-in Enhancement for Speculative Decoding of Long Sequences | May 27, 2025 | 16k, Long-Context Understanding | Code Available | 0 |
| MonarchAttention: Zero-Shot Conversion to Fast, Hardware-Aware Structured Attention | May 24, 2025 | 16k, 4k | Code Available | 1 |
| Training Long-Context LLMs Efficiently via Chunk-wise Optimization | May 22, 2025 | 16k, GPU | Code Available | 2 |
| PSC: Extending Context Window of Large Language Models via Phase Shift Calibration | May 18, 2025 | 16k, Position | Code Available | 0 |