| UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning | Sep 29, 2021 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| roadscene2vec: A Tool for Extracting and Embedding Road Scene-Graphs | Sep 2, 2021 | Action Classification, Graph Embedding | Code Available | 1 | 5 |
| MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | Dec 2, 2021 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition | Mar 19, 2022 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| A Closer Look at Spatiotemporal Convolutions for Action Recognition | Nov 30, 2017 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Actor-agnostic Multi-label Action Recognition with Multi-modal Query | Jul 20, 2023 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers | Jun 15, 2023 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Revisiting ResNets: Improved Training and Scaling Strategies | Mar 13, 2021 | Action Classification, Document Image Classification | Code Available | 1 | 5 |
| Infrared and 3D skeleton feature fusion for RGB-D action recognition | Feb 28, 2020 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Revisiting spatio-temporal layouts for compositional action recognition | Nov 2, 2021 | Action Classification, Action Detection | Code Available | 1 | 5 |
| Self-supervised Video Transformer | Dec 2, 2021 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Mutual Modality Learning for Video Action Classification | Nov 4, 2020 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning | Jun 27, 2022 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Visual Semantic Role Labeling | May 17, 2015 | 16k, Action Classification | Code Available | 1 | 5 |
| AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders | Nov 16, 2022 | Action Classification, Representation Learning | Code Available | 1 | 5 |
| What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment | Apr 8, 2019 | Action Classification, Action Quality Assessment | Code Available | 1 | 5 |
| Autoregressive Adaptive Hypergraph Transformer for Skeleton-based Activity Recognition | Nov 8, 2024 | Action Classification, Activity Recognition | Code Available | 1 | 5 |
| Dual-path Adaptation from Image to Video Transformers | Mar 17, 2023 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| ReAct: Temporal Action Detection with Relational Queries | Jul 14, 2022 | Action Classification, Action Detection | Code Available | 1 | 5 |
| Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition | Jun 20, 2020 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| BABEL: Bodies, Action and Behavior with English Labels | Jun 17, 2021 | 3D Action Recognition, Action Classification | Code Available | 1 | 5 |
| XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning | Nov 25, 2022 | Action Classification, Classification | Code Available | 1 | 5 |
| Region-based Non-local Operation for Video Classification | Jul 17, 2020 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization | Aug 12, 2024 | Action Classification, Action Localization | Code Available | 1 | 5 |
| EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding | Jun 13, 2024 | Action Classification, Action Localization | Code Available | 1 | 5 |
| Enriching Local and Global Contexts for Temporal Action Localization | Jul 27, 2021 | Action Classification, Action Localization | Code Available | 1 | 5 |
| Delving Deep into One-Shot Skeleton-based Action Recognition with Diverse Occlusions | Feb 23, 2022 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| ViViT: A Video Vision Transformer | Mar 29, 2021 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Post-Processing Temporal Action Detection | Nov 27, 2022 | Action Classification, Action Detection | Code Available | 1 | 5 |
| Proposal Relation Network for Temporal Action Detection | Jun 20, 2021 | Action Classification, Action Detection | Code Available | 1 | 5 |
| Representation Learning via Global Temporal Alignment and Cycle-Consistency | May 11, 2021 | Action Classification, Dynamic Time Warping | Code Available | 1 | 5 |
| EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition | Aug 10, 2024 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset | May 22, 2017 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning | Nov 27, 2023 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Boundary-sensitive Pre-training for Temporal Localization in Videos | Nov 21, 2020 | Action Classification, Classification | Code Available | 1 | 5 |
| High Quality Monocular Depth Estimation via Transfer Learning | Dec 31, 2018 | Action Classification, Decoder | Code Available | 1 | 5 |
| HierVL: Learning Hierarchical Video-Language Embeddings | Jan 5, 2023 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning | Dec 6, 2022 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Finding the Missing Data: A BERT-inspired Approach Against Package Loss in Wireless Sensing | Mar 19, 2024 | Action Classification, Deep Learning | Code Available | 1 | 5 |
| BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues | Jul 23, 2020 | Action Classification, Keyword Spotting | Code Available | 1 | 5 |
| Large Scale Holistic Video Understanding | Apr 25, 2019 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Implicit Temporal Modeling with Learnable Alignment for Video Recognition | Apr 20, 2023 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Stand-Alone Inter-Frame Attention in Video Models | Jun 14, 2022 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Back to the Future: Cycle Encoding Prediction for Self-supervised Contrastive Video Representation Learning | Oct 14, 2020 | Action Classification, Action Recognition | Code Available | 0 | 5 |
| Do we really need temporal convolutions in action segmentation? | May 26, 2022 | Action Classification, Action Segmentation | Code Available | 0 | 5 |
| Object Priors for Classifying and Localizing Unseen Actions | Apr 10, 2021 | Action Classification, Action Localization | Code Available | 0 | 5 |
| OccludeNet: A Causal Journey into Mixed-View Actor-Centric Video Action Recognition under Occlusions | Nov 24, 2024 | Action Classification, Action Recognition | Code Available | 0 | 5 |
| ECO: Efficient Convolutional Network for Online Video Understanding | Apr 24, 2018 | Action Classification, Action Recognition | Code Available | 0 | 5 |
| NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis | Apr 11, 2016 | 3D Action Recognition, Action Classification | Code Available | 0 | 5 |
| Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution | Apr 10, 2019 | Action Classification, Image Classification | Code Available | 0 | 5 |