| Flash-VStream: Efficient Real-Time Understanding for Long Video Streams | Jun 30, 2025 | cross-modal alignment, EgoSchema | Code Available | 3 |
| Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs | Jun 27, 2025 | MME, Video MME | Unverified | 0 |
| VideoDeepResearch: Long Video Understanding With Agentic Tool Using | Jun 12, 2025 | MME, Video MME | Code Available | 2 |
| DynTok: Dynamic Compression of Visual Tokens for Efficient and Effective Video Understanding | Jun 4, 2025 | MME, Video MME | Unverified | 0 |
| SiLVR: A Simple Language-based Video Reasoning Framework | May 30, 2025 | Math, MME | Code Available | 1 |
| VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation | May 20, 2025 | MME, Multiple-choice | Code Available | 4 |
| TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos | Apr 24, 2025 | MME, Video MME | Code Available | 3 |
| FRAG: Frame Selection Augmented Generation for Long Video and Long Document Understanding | Apr 24, 2025 | document understanding, MME | Code Available | 1 |
| Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models | Apr 21, 2025 | MME, Video MME | Code Available | 4 |
| An LMM for Efficient Video Understanding via Reinforced Compression of Video Cubes | Apr 21, 2025 | MME, Video MME | Unverified | 0 |