| InternVideo: General Video Foundation Models via Generative and Discriminative Learning | Dec 6, 2022 | Action Classification, Action Recognition | Code Available | 4 |
| VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking | Mar 29, 2023 | Action Classification, Action Recognition | Code Available | 2 |
| Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization | Jun 14, 2020 | Action Detection, Action Localization | Code Available | 1 |
| ST-HOI: A Spatial-Temporal Baseline for Human-Object Interaction Detection in Videos | May 25, 2021 | Action Detection, Human-Object Interaction Anticipation | Code Available | 1 |
| E^2TAD: An Energy-Efficient Tracking-based Action Detector | Apr 9, 2022 | Action Detection, Action Localization | Code Available | 1 |
| 1st place solution for AVA-Kinetics Crossover in AcitivityNet Challenge 2020 | Jun 16, 2020 | Action Localization, Relation Network | Code Available | 1 |
| Action is in the Eye of the Beholder: Eye-gaze Driven Model for Spatio-Temporal Action Localization | Dec 1, 2013 | Action Localization, Classification | Unverified | 0 |
| CFAD: Coarse-to-Fine Action Detector for Spatiotemporal Action Localization | Aug 19, 2020 | Action Detection, Action Localization | Unverified | 0 |
| Learning to track for spatio-temporal action localization | Jun 5, 2015 | Action Localization, Spatio-Temporal Action Localization | Unverified | 0 |
| End-to-End Spatio-Temporal Action Localisation with Video Transformers | Apr 24, 2023 | Action Detection, Action Recognition | Unverified | 0 |