| Open-Vocabulary Video Relation Extraction | Dec 25, 2023 | Action Classification, Action Understanding | Code Available | 1 |
| TSM: Temporal Shift Module for Efficient Video Understanding | Nov 20, 2018 | 3D Action Recognition, Action Classification | Code Available | 1 |
| MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition | Jan 20, 2022 | Action Anticipation, Action Classification | Code Available | 1 |
| DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition | Mar 19, 2022 | Action Classification, Action Recognition | Code Available | 1 |
| A Closer Look at Spatiotemporal Convolutions for Action Recognition | Nov 30, 2017 | Action Classification, Action Recognition | Code Available | 1 |
| Timeception for Complex Action Recognition | Dec 4, 2018 | Action Classification, Action Recognition | Code Available | 1 |
| Dissected 3D CNNs: Temporal Skip Connections for Efficient Online Video Processing | Sep 30, 2020 | Action Classification, Video Recognition | Code Available | 1 |
| Masked Feature Prediction for Self-Supervised Visual Pre-Training | Dec 16, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| A Simple and Efficient Pipeline to Build an End-to-End Spatial-Temporal Action Detector | Jun 7, 2022 | Action Classification, Action Detection | Code Available | 1 |
| UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning | Sep 29, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| Let's Play for Action: Recognizing Activities of Daily Living by Learning from Life Simulation Video Games | Jul 12, 2021 | Action Classification, Activity Recognition | Code Available | 1 |
| Boundary-sensitive Pre-training for Temporal Localization in Videos | Nov 21, 2020 | Action Classification, Classification | Code Available | 1 |
| MAR: Masked Autoencoders for Efficient Action Recognition | Jul 24, 2022 | Action Classification, Action Recognition | Code Available | 1 |
| Video Contrastive Learning with Global Context | Aug 5, 2021 | Action Classification, Action Localization | Code Available | 1 |
| MMNet: A Model-Based Multimodal Network for Human Action Recognition in RGB-D Videos | May 26, 2022 | Action Classification, Action Recognition | Code Available | 1 |
| Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification | Mar 17, 2020 | Action Classification, Classification | Code Available | 1 |
| Autoregressive Adaptive Hypergraph Transformer for Skeleton-based Activity Recognition | Nov 8, 2024 | Action Classification, Activity Recognition | Code Available | 1 |
| Dual-path Adaptation from Image to Video Transformers | Mar 17, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers | Jun 9, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living | May 17, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| BABEL: Bodies, Action and Behavior with English Labels | Jun 17, 2021 | 3D Action Recognition, Action Classification | Code Available | 1 |
| What Can Simple Arithmetic Operations Do for Temporal Modeling? | Jul 18, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| Weakly-supervised Temporal Action Localization by Uncertainty Modeling | Jun 12, 2020 | Action Classification, Action Localization | Code Available | 1 |
| XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning | Nov 25, 2022 | Action Classification, Classification | Code Available | 1 |
| EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding | Jun 13, 2024 | Action Classification, Action Localization | Code Available | 1 |
| ALIP: Adaptive Language-Image Pre-training with Synthetic Caption | Aug 16, 2023 | Action Classification, Image-text Retrieval | Code Available | 1 |
| Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition | Aug 10, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| Enriching Local and Global Contexts for Temporal Action Localization | Jul 27, 2021 | Action Classification, Action Localization | Code Available | 1 |
| HierVL: Learning Hierarchical Video-Language Embeddings | Jan 5, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| ViViT: A Video Vision Transformer | Mar 29, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| Just Add π! Pose Induced Video Transformers for Understanding Activities of Daily Living | Nov 30, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition | Aug 10, 2024 | Action Classification, Action Recognition | Code Available | 1 |
| KNN-MMD: Cross Domain Wireless Sensing via Local Distribution Alignment | Dec 6, 2024 | Action Classification, Action Classification (1-shot) | Code Available | 1 |
| Learning Spatiotemporal Features via Video and Text Pair Discrimination | Jan 16, 2020 | Action Classification, Action Recognition | Code Available | 1 |
| MotionSqueeze: Neural Motion Feature Learning for Video Understanding | Jul 20, 2020 | Action Classification, Action Recognition | Code Available | 1 |
| Learning To Recognize Procedural Activities with Distant Supervision | Jan 26, 2022 | Action Classification, Language Modelling | Code Available | 1 |
| ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning | Jun 27, 2022 | Action Classification, Action Recognition | Code Available | 1 |
| Make Your Training Flexible: Towards Deployment-Efficient Video Models | Mar 18, 2025 | Action Classification, Zero-Shot Video Retrieval | Code Available | 1 |
| Finding the Missing Data: A BERT-inspired Approach Against Package Loss in Wireless Sensing | Mar 19, 2024 | Action Classification, Deep Learning | Code Available | 1 |
| BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues | Jul 23, 2020 | Action Classification, Keyword Spotting | Code Available | 1 |
| Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning | Dec 8, 2022 | Action Classification, Action Recognition | Code Available | 1 |
| Memory-augmented Dense Predictive Coding for Video Representation Learning | Aug 3, 2020 | Action Classification, Action Recognition | Code Available | 1 |
| SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos | Apr 12, 2018 | Action Classification, Action Detection | Code Available | 1 |
| End-to-End Fine-Grained Action Segmentation and Recognition Using Conditional Random Field Models and Discriminative Sparse Coding | Jan 29, 2018 | Action Classification, Action Segmentation | Unverified | 0 |
| Adaptive Intermediate Representations for Video Understanding | Apr 14, 2021 | Action Classification, Optical Flow Estimation | Unverified | 0 |
| Egocentric Audio-Visual Noise Suppression | Nov 7, 2022 | Action Classification, Event Detection | Unverified | 0 |
| ActionBytes: Learning From Trimmed Videos to Localize Actions | Jun 1, 2020 | Action Classification, Action Localization | Unverified | 0 |
| IMUVIE: Pickup Timeline Action Localization via Motion Movies | Nov 19, 2024 | Action Classification, Action Localization | Unverified | 0 |
| Efficient Two-Stream Motion and Appearance 3D CNNs for Video Classification | Aug 31, 2016 | 3D Architecture, Action Classification | Unverified | 0 |
| Efficient Optimization for Average Precision SVM | Dec 1, 2014 | Action Classification, General Classification | Unverified | 0 |