| Multi-Modal Multi-Action Video Recognition | Jan 1, 2021 | Relation, Video Recognition | Code Available | 0 | 5 |
| Fast Approximate Modelling of the Next Combination Result for Stopping the Text Recognition in a Video | Aug 6, 2020 | Video Recognition | Code Available | 0 | 5 |
| Learning to Localize Temporal Events in Large-scale Video Data | Oct 25, 2019 | Temporal Localization, Video Recognition | Code Available | 0 | 5 |
| Collaborative Spatio-temporal Feature Learning for Video Action Recognition | Mar 4, 2019 | Action Recognition, Action Recognition In Videos | Code Available | 0 | 5 |
| Collaborative Spatiotemporal Feature Learning for Video Action Recognition | Jun 1, 2019 | Action Classification, Action Recognition | Code Available | 0 | 5 |
| Inter-intra Variant Dual Representations for Self-supervised Video Recognition | Jul 2, 2021 | Contrastive Learning, Representation Learning | Code Available | 0 | 5 |
| Flow-Guided Feature Aggregation for Video Object Detection | Mar 29, 2017 | Object, object-detection | Code Available | 0 | 5 |
| FAR: Fourier Aerial Video Recognition | Mar 21, 2022 | Action Recognition, Activity Recognition | Code Available | 0 | 5 |
| Sparse Black-box Video Attack with Reinforcement Learning | Jan 11, 2020 | reinforcement-learning, Reinforcement Learning | Code Available | 0 | 5 |
| Spatial-temporal Concept based Explanation of 3D ConvNets | Jun 9, 2022 | Action Classification, Video Recognition | Code Available | 0 | 5 |
| VTD-CLIP: Video-to-Text Discretization via Prompting CLIP | Mar 24, 2025 | parameter-efficient fine-tuning, Video Recognition | Code Available | 0 | 5 |
| Revisiting 3D ResNets for Video Recognition | Sep 3, 2021 | Action Classification, Contrastive Learning | Code Available | 0 | 5 |
| Gate-Shift-Fuse for Video Action Recognition | Mar 16, 2022 | Action Recognition, Temporal Action Localization | Code Available | 0 | 5 |
| ST-ABN: Visual Explanation Taking into Account Spatio-temporal Information for Video Recognition | Oct 29, 2021 | Decision Making, Video Recognition | Code Available | 0 | 5 |
| HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition | Jan 10, 2024 | Action Recognition, Action Recognition In Videos | Code Available | 0 | 5 |
| GenRec: Unifying Video Generation and Recognition with Diffusion Models | Aug 27, 2024 | Image to Video Generation, Video Generation | Code Available | 0 | 5 |
| PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition | Jul 3, 2024 | Position, Video Recognition | Code Available | 0 | 5 |
| VideoPure: Diffusion-based Adversarial Purification for Video Recognition | Jan 25, 2025 | Adversarial Defense, Adversarial Purification | Code Available | 0 | 5 |
| MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing | Jun 13, 2022 | 3D Architecture, Action Classification | Code Available | 0 | 5 |
| Adaptive Detrending to Accelerate Convolutional Gated Recurrent Unit Training for Contextual Video Recognition | May 24, 2017 | Video Recognition | —Unverified | 0 | 0 |
| On the Importance of Spatial Relations for Few-shot Action Recognition | Aug 14, 2023 | Action Recognition, Few-Shot action recognition | —Unverified | 0 | 0 |
| On the Pitfalls of Learning with Limited Data: A Facial Expression Recognition Case Study | Apr 2, 2021 | Data Augmentation, Deep Learning | —Unverified | 0 | 0 |
| On the Surprising Effectiveness of Transformers in Low-Labeled Video Recognition | Sep 15, 2022 | image-classification, Image Classification | —Unverified | 0 | 0 |
| PA3D: Pose-Action 3D Machine for Video Recognition | Jun 1, 2019 | Action Recognition, Optical Flow Estimation | —Unverified | 0 | 0 |
| Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition | Mar 17, 2025 | Action Recognition, Video Recognition | —Unverified | 0 | 0 |
| Training Kinetics in 15 Minutes: Large-scale Distributed Training on Videos | Oct 1, 2019 | GPU, Video Recognition | —Unverified | 0 | 0 |
| Transfer Learning for Video Recognition with Scarce Training Data for Deep Convolutional Neural Network | Sep 15, 2014 | 4k, Transfer Learning | —Unverified | 0 | 0 |
| Attention Distillation for Learning Video Representations | Apr 5, 2019 | Action Recognition, Video Recognition | —Unverified | 0 | 0 |
| Percept, Chat, and then Adapt: Multimodal Knowledge Transfer of Foundation Models for Open-World Video Recognition | Feb 29, 2024 | Transfer Learning, Video Recognition | —Unverified | 0 | 0 |
| Phase-Specific Augmented Reality Guidance for Microscopic Cataract Surgery Using Long-Short Spatiotemporal Aggregation Transformer | Sep 11, 2023 | Multi-Task Learning, Video Recognition | —Unverified | 0 | 0 |
| Transfer-LMR: Heavy-Tail Driving Behavior Recognition in Diverse Traffic Scenarios | May 8, 2024 | Video Recognition | —Unverified | 0 | 0 |
| Action Keypoint Network for Efficient Video Recognition | Jan 17, 2022 | Action Recognition, Point Cloud Classification | —Unverified | 0 | 0 |
| A^2-Nets: Double Attention Networks | Oct 27, 2018 | 3D Absolute Human Pose Estimation, Action Classification | —Unverified | 0 | 0 |
| Purification Of Contaminated Convolutional Neural Networks Via Robust Recovery: An Approach with Theoretical Guarantee in One-Hidden-Layer Case | Jul 4, 2024 | image-classification, Image Classification | —Unverified | 0 | 0 |
| PV-NAS: Practical Neural Architecture Search for Video Recognition | Nov 2, 2020 | Neural Architecture Search, Video Recognition | —Unverified | 0 | 0 |
| Action Detail Matters: Refining Video Recognition with Local Action Queries | Jan 1, 2025 | Action Recognition, Temporal Action Localization | —Unverified | 0 | 0 |
| A Simple and Efficient Baseline for Video Action Recognition | Mar 2, 2025 | Action Recognition, Fine-grained Action Recognition | —Unverified | 0 | 0 |
| Attention Transfer from Web Images for Video Recognition | Aug 3, 2017 | Action Recognition, Temporal Action Localization | —Unverified | 0 | 0 |
| A two-way translation system of Chinese sign language based on computer vision | Jun 3, 2023 | Sentence, Sign Language Recognition | —Unverified | 0 | 0 |
| Recognizing Actions in Videos from Unseen Viewpoints | Mar 30, 2021 | Action Classification, Action Recognition | —Unverified | 0 | 0 |
| Audio-Visual Fusion Layers for Event Type Aware Video Recognition | Feb 12, 2022 | Multi-Task Learning, Video Recognition | —Unverified | 0 | 0 |
| Audio-Visual Glance Network for Efficient Video Recognition | Aug 18, 2023 | Video Recognition, Video Understanding | —Unverified | 0 | 0 |
| Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search | Dec 9, 2021 | Neural Architecture Search, Video Recognition | —Unverified | 0 | 0 |
| A Video Recognition Method by using Adaptive Structural Learning of Long Short Term Memory based Deep Belief Network | Sep 30, 2019 | Time Series, Time Series Analysis | —Unverified | 0 | 0 |
| Recurrent Residual Module for Fast Inference in Videos | Feb 27, 2018 | object-detection, Object Detection | —Unverified | 0 | 0 |
| A robust and efficient video representation for action recognition | Apr 21, 2015 | Action Recognition, Homography Estimation | —Unverified | 0 | 0 |
| Black-box Adversarial Attacks on Video Recognition Models | Apr 10, 2019 | Video Recognition | —Unverified | 0 | 0 |
| Recurring the Transformer for Video Action Recognition | Jan 1, 2022 | Action Recognition, GPU | —Unverified | 0 | 0 |
| BosphorusSign22k Sign Language Recognition Dataset | Apr 2, 2020 | Sign Language Production, Sign Language Recognition | —Unverified | 0 | 0 |
| 11 TeraFLOPs per second photonic convolutional accelerator for deep learning optical neural networks | Nov 14, 2020 | Board Games, Medical Diagnosis | —Unverified | 0 | 0 |