SOTAVerified

EgoSchema

Papers

Showing 1-10 of 40 papers

| Title | Status | Hype |
| --- | --- | --- |
| VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary | Code | 4 |
| Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition | Code | 3 |
| Flash-VStream: Efficient Real-Time Understanding for Long Video Streams | Code | 3 |
| Video ReCap: Recursive Captioning of Hour-Long Videos | Code | 3 |
| VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos | Code | 2 |
| VideoAgent: Long-form Video Understanding with Large Language Model as Agent | Code | 2 |
| LLaVAction: evaluating and training multi-modal large language models for action recognition | Code | 2 |
| Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model | Code | 2 |
| TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning | Code | 2 |
| EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding | Code | 1 |

No leaderboard results yet.