| Revisiting Classifier: Transferring Vision-Language Models for Video Recognition | Jul 4, 2022 | Action Classification, Action Recognition | Code Available | 2 |
| VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking | Mar 29, 2023 | Action Classification, Action Recognition | Code Available | 2 |
| Omnivore: A Single Model for Many Visual Modalities | Jan 20, 2022 | Action Classification, Action Recognition | Code Available | 2 |
| UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer | Sep 22, 2022 | Action Classification, Action Recognition | Code Available | 2 |
| Omni-sourced Webly-supervised Learning for Video Recognition | Mar 29, 2020 | Action Classification, Action Recognition | Code Available | 2 |
| Is Space-Time Attention All You Need for Video Understanding? | Feb 9, 2021 | Action Classification, Action Recognition | Code Available | 2 |
| Learning Video Representations from Large Language Models | Dec 8, 2022 | Action Classification, Action Recognition | Code Available | 2 |
| AIM: Adapting Image Models for Efficient Video Action Recognition | Feb 6, 2023 | Action Classification, Action Recognition | Code Available | 2 |
| MARLIN: Masked Autoencoder for facial video Representation LearnINg | Nov 12, 2022 | Action Classification, Attribute | Code Available | 2 |
| Temporal Segment Networks for Action Recognition in Videos | May 8, 2017 | Action Classification, Action Recognition | Code Available | 2 |