| InternVideo: General Video Foundation Models via Generative and Discriminative Learning | Dec 6, 2022 | Action Classification, Action Recognition | Code Available | 4 | 5 |
| VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking | Mar 29, 2023 | Action Classification, Action Recognition | Code Available | 2 | 5 |
| E^2TAD: An Energy-Efficient Tracking-based Action Detector | Apr 9, 2022 | Action Detection, Action Localization | Code Available | 1 | 5 |
| Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization | Jun 14, 2020 | Action Detection, Action Localization | Code Available | 1 | 5 |
| 1st place solution for AVA-Kinetics Crossover in AcitivityNet Challenge 2020 | Jun 16, 2020 | Action Localization, Relation Network | Code Available | 1 | 5 |
| ST-HOI: A Spatial-Temporal Baseline for Human-Object Interaction Detection in Videos | May 25, 2021 | Action Detection, Human-Object Interaction Anticipation | Code Available | 1 | 5 |
| Chained Multi-stream Networks Exploiting Pose, Motion, and Appearance for Action Classification and Detection | Apr 3, 2017 | Action Classification, Action Localization | Code Available | 0 | 5 |
| Action Tubelet Detector for Spatio-Temporal Action Localization | May 4, 2017 | Action Localization, Spatio-Temporal Action Localization | Code Available | 0 | 5 |
| Scaling Open-Vocabulary Action Detection | Apr 4, 2025 | Action Detection, Multiple Action Detection | Code Available | 0 | 5 |
| Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision | Dec 9, 2021 | Action Localization, Action Recognition | Code Available | 0 | 5 |