| TOPA: Extending Large Language Models for Video Understanding via Text-Only Pre-Alignment | May 22, 2024 | EgoSchema, Video Understanding | Code Available | 1 |
| Agentic Keyframe Search for Video Question Answering | Mar 20, 2025 | EgoSchema, Question Answering | Code Available | 1 |
| EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding | Aug 17, 2023 | Diagnostic, EgoSchema | Code Available | 1 |
| HCQA @ Ego4D EgoSchema Challenge 2024 | Jun 22, 2024 | Caption Generation | Code Available | 1 |
| Too Many Frames, Not All Useful: Efficient Strategies for Long-Form Video QA | Jun 13, 2024 | All, EgoSchema | Code Available | 1 |
| A Simple LLM Framework for Long-Range Video Question-Answering | Dec 28, 2023 | EgoSchema, Language Modelling | Code Available | 1 |
| Language Repository for Long Video Understanding | Mar 21, 2024 | EgoSchema, Question Answering | Code Available | 1 |
| LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos | Dec 7, 2023 | EgoSchema, Form | Code Available | 1 |
| VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs | Sep 30, 2024 | EgoSchema, Language Modelling | Code Available | 1 |
| Memory Consolidation Enables Long-Context Video Understanding | Feb 8, 2024 | EgoSchema, Video Understanding | Unverified | 0 |