| FrameExit: Conditional Early Exiting for Efficient Video Recognition | Apr 27, 2021 | Video Recognition, Video Understanding | Code Available | 1 | 5 |
| Frame Flexible Network | Mar 26, 2023 | Video Recognition | Code Available | 1 | 5 |
| Frozen CLIP Models are Efficient Video Learners | Aug 6, 2022 | Action Classification, Decoder | Code Available | 1 | 5 |
| Adapting Short-Term Transformers for Action Detection in Untrimmed Videos | Dec 4, 2023 | Action Detection, Video Recognition | Code Available | 1 | 5 |
| Temporal-attentive Covariance Pooling Networks for Video Recognition | Oct 27, 2021 | Video Recognition | Code Available | 1 | 5 |
| Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation | Jul 9, 2020 | Few-Shot Image Classification, Few-Shot Learning | Code Available | 1 | 5 |
| VG4D: Vision-Language Model Goes 4D Video Recognition | Apr 17, 2024 | Action Recognition, Autonomous Driving | Code Available | 1 | 5 |
| TAM: Temporal Adaptive Module for Video Recognition | May 14, 2020 | Action Recognition, Video Recognition | Code Available | 1 | 5 |
| DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning | May 25, 2021 | Action Recognition, Long-range modeling | Code Available | 1 | 5 |
| Glance and Focus Networks for Dynamic Visual Recognition | Jan 9, 2022 | image-classification, Image Classification | Code Available | 1 | 5 |
| AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition | Dec 28, 2021 | Computational Efficiency, Diversity | Code Available | 1 | 5 |
| TSM: Temporal Shift Module for Efficient Video Understanding | Nov 20, 2018 | 3D Action Recognition, Action Classification | Code Available | 1 | 5 |
| Automated Sperm Assessment Framework and Neural Network Specialized for Sperm Video Recognition | Nov 10, 2023 | Video Recognition | Code Available | 0 | 5 |
| Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition | Dec 18, 2023 | Video Recognition | Code Available | 0 | 5 |
| Inter-intra Variant Dual Representations for Self-supervised Video Recognition | Jul 2, 2021 | Contrastive Learning, Representation Learning | Code Available | 0 | 5 |
| Audiovisual SlowFast Networks for Video Recognition | Jan 23, 2020 | Action Classification, Video Recognition | Code Available | 0 | 5 |
| Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition | Jan 11, 2024 | Video Recognition | Code Available | 0 | 5 |
| Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles | Jun 1, 2023 | Action Classification, Action Recognition | Code Available | 0 | 5 |
| Heuristic Black-box Adversarial Attacks on Video Recognition Models | Nov 21, 2019 | Adversarial Attack, Video Recognition | Code Available | 0 | 5 |
| Tiny Updater: Towards Efficient Neural Network-Driven Software Updating | Jan 1, 2023 | Efficient Neural Network, image-classification | Code Available | 0 | 5 |
| HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition | Jan 10, 2024 | Action Recognition, Action Recognition In Videos | Code Available | 0 | 5 |
| testRNN: Coverage-guided Testing on Recurrent Neural Networks | Jun 20, 2019 | Molecular Property Prediction, Property Prediction | Code Available | 0 | 5 |
| Use Your Head: Improving Long-Tail Video Recognition | Apr 3, 2023 | Video Recognition | Code Available | 0 | 5 |
| TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter | Jun 22, 2023 | Question Answering, Retrieval | Code Available | 0 | 5 |
| GenRec: Unifying Video Generation and Recognition with Diffusion Models | Aug 27, 2024 | Image to Video Generation, Video Generation | Code Available | 0 | 5 |
| ST-ABN: Visual Explanation Taking into Account Spatio-temporal Information for Video Recognition | Oct 29, 2021 | Decision Making, Video Recognition | Code Available | 0 | 5 |
| Gate-Shift-Fuse for Video Action Recognition | Mar 16, 2022 | Action Recognition, Temporal Action Localization | Code Available | 0 | 5 |
| Sparse Black-box Video Attack with Reinforcement Learning | Jan 11, 2020 | reinforcement-learning, Reinforcement Learning | Code Available | 0 | 5 |
| Spatial-temporal Concept based Explanation of 3D ConvNets | Jun 9, 2022 | Action Classification, Video Recognition | Code Available | 0 | 5 |
| FAR: Fourier Aerial Video Recognition | Mar 21, 2022 | Action Recognition, Activity Recognition | Code Available | 0 | 5 |
| Flow-Guided Feature Aggregation for Video Object Detection | Mar 29, 2017 | Object, object-detection | Code Available | 0 | 5 |
| Should I take a walk? Estimating Energy Expenditure from Video Data | Feb 1, 2022 | Video Recognition | Code Available | 0 | 5 |
| Collaborative Spatiotemporal Feature Learning for Video Action Recognition | Jun 1, 2019 | Action Classification, Action Recognition | Code Available | 0 | 5 |
| Sequence Level Semantics Aggregation for Video Object Detection | Jul 15, 2019 | Clustering, Object | Code Available | 0 | 5 |
| Temporal Modeling Approaches for Large-scale Youtube-8M Video Understanding | Jul 14, 2017 | Video Recognition, Video Understanding | Code Available | 0 | 5 |
| Fast Approximate Modelling of the Next Combination Result for Stopping the Text Recognition in a Video | Aug 6, 2020 | Video Recognition | Code Available | 0 | 5 |
| Collaborative Spatio-temporal Feature Learning for Video Action Recognition | Mar 4, 2019 | Action Recognition, Action Recognition In Videos | Code Available | 0 | 5 |
| QTTNet: Quantized Tensor Train Neural Networks for 3D Object and Video Recognition | Sep 20, 2021 | Quantization, Video Recognition | Code Available | 0 | 5 |
| Revisiting 3D ResNets for Video Recognition | Sep 3, 2021 | Action Classification, Contrastive Learning | Code Available | 0 | 5 |
| Excitation Dropout: Encouraging Plasticity in Deep Neural Networks | May 23, 2018 | Decision Making, Video Recognition | Code Available | 0 | 5 |
| PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition | Jul 3, 2024 | Position, Video Recognition | Code Available | 0 | 5 |
| Efficient Robustness Assessment via Adversarial Spatial-Temporal Focus on Videos | Jan 3, 2023 | Action Recognition, Adversarial Robustness | Code Available | 0 | 5 |
| A^2-Nets: Double Attention Networks | Dec 1, 2018 | Action Classification, Action Recognition | Code Available | 0 | 5 |
| Overcomplete Representations Against Adversarial Videos | Dec 8, 2020 | Adversarial Robustness, Decoder | Code Available | 0 | 5 |
| On the Relevance of Temporal Features for Medical Ultrasound Video Recognition | Oct 16, 2023 | Video Recognition | Code Available | 0 | 5 |
| Open-Ended Multi-Modal Relational Reasoning for Video Question Answering | Dec 1, 2020 | Question Answering, Relational Reasoning | Code Available | 0 | 5 |
| Optimization Planning for 3D ConvNets | Jan 11, 2022 | Video Recognition | Code Available | 0 | 5 |
| DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition | Jul 16, 2025 | Benchmarking, Knowledge Distillation | Code Available | 0 | 5 |
| Learning to Localize Temporal Events in Large-scale Video Data | Oct 25, 2019 | Temporal Localization, Video Recognition | Code Available | 0 | 5 |
| Learning Spatio-Temporal Representation with Local and Global Diffusion | Jun 13, 2019 | Action Classification, Action Detection | Code Available | 0 | 5 |