| BABEL: Bodies, Action and Behavior with English Labels | Jun 17, 2021 | 3D Action Recognition, Action Classification | Code Available | 1 |
| TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks | Nov 23, 2020 | Action Classification, Action Localization | Code Available | 1 |
| Enriching Local and Global Contexts for Temporal Action Localization | Jul 27, 2021 | Action Classification, Action Localization | Code Available | 1 |
| HierVL: Learning Hierarchical Video-Language Embeddings | Jan 5, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding | Jun 13, 2024 | Action Classification, Action Localization | Code Available | 1 |
| UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition | Jul 19, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | Dec 2, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| Florence: A New Foundation Model for Computer Vision | Nov 22, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| ViViT: A Video Vision Transformer | Mar 29, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| Implicit Temporal Modeling with Learnable Alignment for Video Recognition | Apr 20, 2023 | Action Classification, Action Recognition | Code Available | 1 |