| HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition | Jan 10, 2024 | Action Recognition, Action Recognition In Videos | Code Available | 0 |
| PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition | Jul 3, 2024 | Position, Video Recognition | Code Available | 0 |
| Heuristic Black-box Adversarial Attacks on Video Recognition Models | Nov 21, 2019 | Adversarial Attack, Video Recognition | Code Available | 0 |
| Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles | Jun 1, 2023 | Action Classification, Action Recognition | Code Available | 0 |
| ST-ABN: Visual Explanation Taking into Account Spatio-temporal Information for Video Recognition | Oct 29, 2021 | Decision Making, Video Recognition | Code Available | 0 |
| Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition | Jan 11, 2024 | Video Recognition | Code Available | 0 |
| VidConv: A modernized 2D ConvNet for Efficient Video Recognition | Jul 8, 2022 | Action Recognition, Video Recognition | Code Available | 0 |
| Collaborative Spatio-temporal Feature Learning for Video Action Recognition | Mar 4, 2019 | Action Recognition, Action Recognition In Videos | Code Available | 0 |
| TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter | Jun 22, 2023 | Question Answering, Retrieval | Code Available | 0 |
| Inter-intra Variant Dual Representations for Self-supervised Video Recognition | Jul 2, 2021 | Contrastive Learning, Representation Learning | Code Available | 0 |
| Collaborative Spatiotemporal Feature Learning for Video Action Recognition | Jun 1, 2019 | Action Classification, Action Recognition | Code Available | 0 |
| Adaptive occlusion sensitivity analysis for visually explaining video recognition networks | Jul 26, 2022 | Decision Making, image-classification | Code Available | 0 |
| A^2-Nets: Double Attention Networks | Dec 1, 2018 | Action Classification, Action Recognition | Code Available | 0 |
| QTTNet: Quantized Tensor Train Neural Networks for 3D Object and Video Recognition. | Sep 20, 2021 | Quantization, Video Recognition | Code Available | 0 |
| Multi-Modal Multi-Action Video Recognition | Jan 1, 2021 | Relation, Video Recognition | Code Available | 0 |
| Don't Judge by the Look: Towards Motion Coherent Video Representation | Mar 14, 2024 | Data Augmentation, Object Recognition | Code Available | 0 |
| Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition | Dec 18, 2023 | Video Recognition | Code Available | 0 |
| VTD-CLIP: Video-to-Text Discretization via Prompting CLIP | Mar 24, 2025 | parameter-efficient fine-tuning, Video Recognition | Code Available | 0 |
| DriftNet: Aggressive Driving Behavior Classification using 3D EfficientNet Architecture | Apr 18, 2020 | Anomaly Detection, Classification | Code Available | 0 |
| VideoPure: Diffusion-based Adversarial Purification for Video Recognition | Jan 25, 2025 | Adversarial Defense, Adversarial Purification | Code Available | 0 |
| Revisiting 3D ResNets for Video Recognition | Sep 3, 2021 | Action Classification, Contrastive Learning | Code Available | 0 |
| Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution | Apr 10, 2019 | Action Classification, Image Classification | Code Available | 0 |
| Video Transformer Network | Feb 1, 2021 | Action Classification, Action Recognition | Code Available | 0 |
| VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models | May 13, 2025 | Form, Multiple-choice | Code Available | 0 |
| Object-centric Video Representation for Long-term Action Anticipation | Oct 31, 2023 | Action Anticipation, Human-Object Interaction Detection | Code Available | 0 |
| Object State Change Classification in Egocentric Videos using the Divided Space-Time Attention Mechanism | Jul 24, 2022 | Object, Object State Change Classification | Code Available | 0 |
| Audiovisual SlowFast Networks for Video Recognition | Jan 23, 2020 | Action Classification, Video Recognition | Code Available | 0 |
| Automated Sperm Assessment Framework and Neural Network Specialized for Sperm Video Recognition | Nov 10, 2023 | Video Recognition | Code Available | 0 |
| DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition | Jul 16, 2025 | Benchmarking, Knowledge Distillation | Code Available | 0 |
| Sequence Level Semantics Aggregation for Video Object Detection | Jul 15, 2019 | Clustering, Object | Code Available | 0 |
| Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition | Aug 22, 2023 | Multiview Learning, Video Recognition | Code Available | 0 |
| Learning Spatio-Temporal Representation with Local and Global Diffusion | Jun 13, 2019 | Action Classification, Action Detection | Code Available | 0 |
| Temporal Modeling Approaches for Large-scale Youtube-8M Video Understanding | Jul 14, 2017 | Video Recognition, Video Understanding | Code Available | 0 |
| Learning to Localize Temporal Events in Large-scale Video Data | Oct 25, 2019 | Temporal Localization, Video Recognition | Code Available | 0 |
| Use Your Head: Improving Long-Tail Video Recognition | Apr 3, 2023 | Video Recognition | Code Available | 0 |
| On the Relevance of Temporal Features for Medical Ultrasound Video Recognition | Oct 16, 2023 | Video Recognition | Code Available | 0 |
| Excitation Dropout: Encouraging Plasticity in Deep Neural Networks | May 23, 2018 | Decision Making, Video Recognition | Code Available | 0 |
| Open-Ended Multi-Modal Relational Reasoning for Video Question Answering | Dec 1, 2020 | Question Answering, Relational Reasoning | Code Available | 0 |
| LogoStyleFool: Vitiating Video Recognition Systems via Logo Style Transfer | Dec 15, 2023 | reinforcement-learning, Reinforcement Learning | Code Available | 0 |
| Optimization Planning for 3D ConvNets | Jan 11, 2022 | Video Recognition | Code Available | 0 |
| Long-term Recurrent Convolutional Networks for Visual Recognition and Description | Nov 17, 2014 | Image Description, Retrieval | Code Available | 0 |
| Orthogonal Temporal Interpolation for Zero-Shot Video Recognition | Aug 14, 2023 | Video Recognition, Zero-Shot Action Recognition | Code Available | 0 |
| Micro-Batch Training with Batch-Channel Normalization and Weight Standardization | Mar 25, 2019 | GPU, image-classification | Code Available | 0 |
| testRNN: Coverage-guided Testing on Recurrent Neural Networks | Jun 20, 2019 | Molecular Property Prediction, Property Prediction | Code Available | 0 |
| Overcomplete Representations Against Adversarial Videos | Dec 8, 2020 | Adversarial Robustness, Decoder | Code Available | 0 |
| GenRec: Unifying Video Generation and Recognition with Diffusion Models | Aug 27, 2024 | Image to Video Generation, Video Generation | Code Available | 0 |
| Gate-Shift-Fuse for Video Action Recognition | Mar 16, 2022 | Action Recognition, Temporal Action Localization | Code Available | 0 |
| Flow-Guided Feature Aggregation for Video Object Detection | Mar 29, 2017 | Object, object-detection | Code Available | 0 |
| FAR: Fourier Aerial Video Recognition | Mar 21, 2022 | Action Recognition, Activity Recognition | Code Available | 0 |
| Efficient Robustness Assessment via Adversarial Spatial-Temporal Focus on Videos | Jan 3, 2023 | Action Recognition, Adversarial Robustness | Code Available | 0 |