| FrameExit: Conditional Early Exiting for Efficient Video Recognition | Apr 27, 2021 | Video Recognition, Video Understanding | Code Available | 1 | 5 |
| Frame Flexible Network | Mar 26, 2023 | Video Recognition | Code Available | 1 | 5 |
| Frozen CLIP Models are Efficient Video Learners | Aug 6, 2022 | Action Classification, Decoder | Code Available | 1 | 5 |
| Adapting Short-Term Transformers for Action Detection in Untrimmed Videos | Dec 4, 2023 | Action Detection, Video Recognition | Code Available | 1 | 5 |
| Space-time Mixing Attention for Video Transformer | Jun 10, 2021 | Action Classification, Action Recognition | Code Available | 1 | 5 |
| Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation | Jul 9, 2020 | Few-Shot Image Classification, Few-Shot Learning | Code Available | 1 | 5 |
| Audio-Visual Class-Incremental Learning | Aug 21, 2023 | class-incremental learning, Class Incremental Learning | Code Available | 1 | 5 |
| BASKET: A Large-Scale Video Dataset for Fine-Grained Skill Estimation | Mar 26, 2025 | Video Recognition | Code Available | 1 | 5 |
| SVFormer: Semi-supervised Video Transformer for Action Recognition | Nov 23, 2022 | Action Recognition, image-classification | Code Available | 1 | 5 |
| Glance and Focus Networks for Dynamic Visual Recognition | Jan 9, 2022 | image-classification, Image Classification | Code Available | 1 | 5 |
| Group Contextualization for Video Recognition | Mar 18, 2022 | Action Recognition, Egocentric Activity Recognition | Code Available | 1 | 5 |
| TSM: Temporal Shift Module for Efficient Video Understanding | Nov 20, 2018 | 3D Action Recognition, Action Classification | Code Available | 1 | 5 |
| Automated Sperm Assessment Framework and Neural Network Specialized for Sperm Video Recognition | Nov 10, 2023 | Video Recognition | Code Available | 0 | 5 |
| Training Kinetics in 15 Minutes: Large-scale Distributed Training on Videos | Oct 1, 2019 | GPU, Video Recognition | Code Available | 0 | 5 |
| Inter-intra Variant Dual Representations for Self-supervised Video Recognition | Jul 2, 2021 | Contrastive Learning, Representation Learning | Code Available | 0 | 5 |
| Audiovisual SlowFast Networks for Video Recognition | Jan 23, 2020 | Action Classification, Video Recognition | Code Available | 0 | 5 |
| Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition | Jan 11, 2024 | Video Recognition | Code Available | 0 | 5 |
| testRNN: Coverage-guided Testing on Recurrent Neural Networks | Jun 20, 2019 | Molecular Property Prediction, Property Prediction | Code Available | 0 | 5 |
| Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles | Jun 1, 2023 | Action Classification, Action Recognition | Code Available | 0 | 5 |
| Heuristic Black-box Adversarial Attacks on Video Recognition Models | Nov 21, 2019 | Adversarial Attack, Video Recognition | Code Available | 0 | 5 |
| Coverage Guided Testing for Recurrent Neural Networks | Nov 5, 2019 | Defect Detection, Drug Discovery | Code Available | 0 | 5 |
| Tiny Updater: Towards Efficient Neural Network-Driven Software Updating | Jan 1, 2023 | Efficient Neural Network, image-classification | Code Available | 0 | 5 |
| HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition | Jan 10, 2024 | Action Recognition, Action Recognition In Videos | Code Available | 0 | 5 |
| Temporal Modeling Approaches for Large-scale Youtube-8M Video Understanding | Jul 14, 2017 | Video Recognition, Video Understanding | Code Available | 0 | 5 |
| Temporal superimposed crossover module for effective continuous sign language | Nov 7, 2022 | image-classification, Image Classification | Code Available | 0 | 5 |
| ST-ABN: Visual Explanation Taking into Account Spatio-temporal Information for Video Recognition | Oct 29, 2021 | Decision Making, Video Recognition | Code Available | 0 | 5 |
| GenRec: Unifying Video Generation and Recognition with Diffusion Models | Aug 27, 2024 | Image to Video Generation, Video Generation | Code Available | 0 | 5 |
| Spatial-temporal Concept based Explanation of 3D ConvNets | Jun 9, 2022 | Action Classification, Video Recognition | Code Available | 0 | 5 |
| TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter | Jun 22, 2023 | Question Answering, Retrieval | Code Available | 0 | 5 |
| Gate-Shift-Fuse for Video Action Recognition | Mar 16, 2022 | Action Recognition, Temporal Action Localization | Code Available | 0 | 5 |
| Sparse Black-box Video Attack with Reinforcement Learning | Jan 11, 2020 | reinforcement-learning, Reinforcement Learning | Code Available | 0 | 5 |
| FAR: Fourier Aerial Video Recognition | Mar 21, 2022 | Action Recognition, Activity Recognition | Code Available | 0 | 5 |
| Flow-Guided Feature Aggregation for Video Object Detection | Mar 29, 2017 | Object, object-detection | Code Available | 0 | 5 |
| Collaborative Spatiotemporal Feature Learning for Video Action Recognition | Jun 1, 2019 | Action Classification, Action Recognition | Code Available | 0 | 5 |
| Sequence Level Semantics Aggregation for Video Object Detection | Jul 15, 2019 | Clustering, Object | Code Available | 0 | 5 |
| Should I take a walk? Estimating Energy Expenditure from Video Data | Feb 1, 2022 | Video Recognition | Code Available | 0 | 5 |
| Fast Approximate Modelling of the Next Combination Result for Stopping the Text Recognition in a Video | Aug 6, 2020 | Video Recognition | Code Available | 0 | 5 |
| Collaborative Spatio-temporal Feature Learning for Video Action Recognition | Mar 4, 2019 | Action Recognition, Action Recognition In Videos | Code Available | 0 | 5 |
| QTTNet: Quantized Tensor Train Neural Networks for 3D Object and Video Recognition. | Sep 20, 2021 | Quantization, Video Recognition | Code Available | 0 | 5 |
| Excitation Dropout: Encouraging Plasticity in Deep Neural Networks | May 23, 2018 | Decision Making, Video Recognition | Code Available | 0 | 5 |
| Revisiting 3D ResNets for Video Recognition | Sep 3, 2021 | Action Classification, Contrastive Learning | Code Available | 0 | 5 |
| PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition | Jul 3, 2024 | Position, Video Recognition | Code Available | 0 | 5 |
| Efficient Robustness Assessment via Adversarial Spatial-Temporal Focus on Videos | Jan 3, 2023 | Action Recognition, Adversarial Robustness | Code Available | 0 | 5 |
| Overcomplete Representations Against Adversarial Videos | Dec 8, 2020 | Adversarial Robustness, Decoder | Code Available | 0 | 5 |
| A^2-Nets: Double Attention Networks | Dec 1, 2018 | Action Classification, Action Recognition | Code Available | 0 | 5 |
| On the Relevance of Temporal Features for Medical Ultrasound Video Recognition | Oct 16, 2023 | Video Recognition | Code Available | 0 | 5 |
| Open-Ended Multi-Modal Relational Reasoning for Video Question Answering | Dec 1, 2020 | Question Answering, Relational Reasoning | Code Available | 0 | 5 |
| Object State Change Classification in Egocentric Videos using the Divided Space-Time Attention Mechanism | Jul 24, 2022 | Object, Object State Change Classification | Code Available | 0 | 5 |
| DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition | Jul 16, 2025 | Benchmarking, Knowledge Distillation | Code Available | 0 | 5 |
| Learning to Localize Temporal Events in Large-scale Video Data | Oct 25, 2019 | Temporal Localization, Video Recognition | Code Available | 0 | 5 |