Rescaling Egocentric Vision
Dima Damen, Hazel Doughty, Giovanni Maria Farinella, Antonino Furnari, Evangelos Kazakos, Jian Ma, Davide Moltisanti, Jonathan Munro, Toby Perrett, Will Price, Michael Wray
Code
- github.com/epic-kitchens/epic-kitchens-100-annotations (official, ★ 168)
- github.com/epic-kitchens/epic-kitchens-100-narrator (official, in paper, ★ 25)
- github.com/epic-kitchens/epic-kitchens-slowfast (PyTorch, ★ 35)
- github.com/epic-kitchens/C1-Action-Recognition-TSN-TRN-TSM (PyTorch, ★ 33)
- github.com/dibschat/tempAgg (PyTorch, ★ 11)
- github.com/jonmun/EPIC-KITCHENS-100_UDA_TA3N (PyTorch, ★ 8)
- github.com/mustafa1728/TA3N-Lightning (PyTorch, ★ 1)
Abstract
This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC-KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M frames, and 90K actions in 700 variable-length videos, capturing long-term unscripted activities in 45 environments using head-mounted cameras. Compared to its previous version, EPIC-KITCHENS-100 has been annotated using a novel pipeline that allows denser (54% more actions per minute) and more complete (128% more action segments) annotations of fine-grained actions. This collection enables new challenges such as action detection and evaluating the "test of time", i.e. whether models trained on data collected in 2018 can generalise to new footage collected two years later. The dataset is aligned with 6 challenges: action recognition (full and weak supervision), action detection, action anticipation, cross-modal retrieval (from captions), and unsupervised domain adaptation for action recognition. For each challenge, we define the task and provide baselines and evaluation metrics.
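The annotations behind these statistics are distributed as CSV files in the epic-kitchens-100-annotations repository, with each action segment given start/stop timestamps. As a minimal sketch, the snippet below parses such timestamps and computes per-segment durations; the column names and timestamp format are assumptions based on that repository's layout, and the sample rows are hypothetical, so verify against the actual files before relying on them.

```python
from datetime import datetime

# Hypothetical rows mimicking the assumed annotation CSV layout
# (narration_id, start_timestamp, stop_timestamp, verb, noun).
rows = [
    {"narration_id": "P01_101_0", "start_timestamp": "00:00:01.089",
     "stop_timestamp": "00:00:03.120", "verb": "open", "noun": "door"},
    {"narration_id": "P01_101_1", "start_timestamp": "00:00:02.500",
     "stop_timestamp": "00:00:06.000", "verb": "turn-on", "noun": "light"},
]

def seconds(ts: str) -> float:
    """Convert an HH:MM:SS.fff timestamp string into seconds."""
    t = datetime.strptime(ts, "%H:%M:%S.%f")
    return t.hour * 3600 + t.minute * 60 + t.second + t.microsecond / 1e6

# Duration of each action segment in seconds.
durations = [seconds(r["stop_timestamp"]) - seconds(r["start_timestamp"])
             for r in rows]
print([round(d, 3) for d in durations])  # → [2.031, 3.5]
```

The same parsing suffices to reproduce aggregate figures such as actions per minute once the full CSVs are loaded.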
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| EPIC-KITCHENS-100 | RU-LSTM | Recall@5 | 13.94 | — | Unverified |
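The Recall@5 figure above counts a prediction as correct when the ground-truth class appears among the model's five highest-scoring classes. A minimal instance-level sketch of that metric follows; note the anticipation benchmark in the paper reports a class-mean variant, which this simplified version does not implement, and the scores and labels here are made-up toy data.

```python
def top5_recall(scores, labels):
    """Fraction of instances whose true label is among the 5 highest scores.

    scores: list of per-class score lists; labels: true class indices.
    """
    hits = 0
    for s, y in zip(scores, labels):
        # Indices of the five highest-scoring classes for this instance.
        top5 = sorted(range(len(s)), key=lambda i: s[i], reverse=True)[:5]
        hits += y in top5
    return hits / len(labels)

# Toy example: 2 instances over 7 classes.
scores = [[0.1, 0.5, 0.05, 0.2, 0.4, 0.3, 0.02],
          [0.9, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06]]
labels = [5, 1]  # first true class is in the top 5, second is not
print(top5_recall(scores, labels))  # → 0.5
```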