| Egocentric Video-Language Pretraining | Jun 3, 2022 | Action Recognition, Contrastive Learning | Code Available | 2 | 5 |
| Egocentric Video-Language Pretraining @ EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022 | Jul 4, 2022 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation | Jun 26, 2024 | Action Anticipation, Action Recognition | Code Available | 2 | 5 |
| Learning Video Representations from Large Language Models | Dec 8, 2022 | Action Classification, Action Recognition | Code Available | 2 | 5 |
| Learning video retrieval models with relevance-aware online mining | Mar 16, 2022 | Multi-Instance Retrieval, Retrieval | Code Available | 1 | 5 |
| HierVL: Learning Hierarchical Video-Language Embeddings | Jan 5, 2023 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone | Jul 11, 2023 | Action Recognition, Moment Queries | Code Available | 1 | 5 |
| Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning | Mar 2, 2025 | Large Language Model, Multi-Instance Retrieval | Code Available | 1 | 5 |
| EgoNCE++: Do Egocentric Video-Language Models Really Understand Hand-Object Interactions? | May 28, 2024 | Action Recognition, Attribute | Code Available | 1 | 5 |
| Training a Large Video Model on a Single Machine in a Day | Sep 28, 2023 | Action Recognition, CPU | Code Available | 1 | 5 |