| Video Swin Transformers for Egocentric Video Understanding @ Ego4D Challenges 2022 | Jul 22, 2022 | Object, Object State Change Classification | Unverified | 0 |
| Watch and Learn: Leveraging Expert Knowledge and Language for Surgical Video Understanding | Mar 14, 2025 | Denoising, Dense Video Captioning | Unverified | 0 |
| Weakly-Supervised Temporal Localization via Occurrence Count Learning | May 17, 2019 | Onset Detection, Temporal Localization | Unverified | 0 |
| What do I Annotate Next? An Empirical Study of Active Learning for Action Localization | Sep 1, 2018 | Action Localization, Active Learning | Unverified | 0 |
| Asynchronous Temporal Fields for Action Recognition | Dec 19, 2016 | Action Classification, Action Recognition | Code Available | 0 |
| Learning to Localize Temporal Events in Large-scale Video Data | Oct 25, 2019 | Temporal Localization, Video Recognition | Code Available | 0 |
| SoftLoc: Robust Temporal Localization under Label Misalignment | Sep 25, 2019 | Position, Temporal Localization | Code Available | 0 |
| Am I Done? Predicting Action Progress in Videos | May 4, 2017 | Action Detection, Temporal Localization | Code Available | 0 |
| When Did It Happen? Duration-informed Temporal Localization of Narrated Actions in Vlogs | Feb 16, 2022 | Action Localization, Temporal Action Localization | Code Available | 0 |
| Transforming faces into video stories -- VideoFace2.0 | May 4, 2025 | Face Detection, Face Recognition | Code Available | 0 |