| MAR: Masked Autoencoders for Efficient Action Recognition | Jul 24, 2022 | Action Classification, Action Recognition | Code Available | 1 |
| In Defense of Image Pre-Training for Spatiotemporal Recognition | May 3, 2022 | GPU, STS | Code Available | 1 |
| Long Movie Clip Classification with State-Space Video Models | Apr 4, 2022 | Classification, Decoder | Code Available | 1 |
| Group Contextualization for Video Recognition | Mar 18, 2022 | Action Recognition, Egocentric Activity Recognition | Code Available | 1 |
| Fast Differentiable Matrix Square Root and Inverse Square Root | Jan 29, 2022 | Style Transfer, Video Recognition | Code Available | 1 |
| MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition | Jan 20, 2022 | Action Anticipation, Action Classification | Code Available | 1 |
| OCSampler: Compressing Videos to One Clip with Single-step Sampling | Jan 12, 2022 | GPU, Video Recognition | Code Available | 1 |
| Glance and Focus Networks for Dynamic Visual Recognition | Jan 9, 2022 | image-classification, Image Classification | Code Available | 1 |
| AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition | Dec 28, 2021 | Computational Efficiency, Diversity | Code Available | 1 |
| DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition | Dec 9, 2021 | Video Recognition | Code Available | 1 |
| MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | Dec 2, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| Pooling by Sliced-Wasserstein Embedding | Dec 1, 2021 | Graph Learning, image-classification | Code Available | 1 |
| TokenLearner: Adaptive Space-Time Tokenization for Videos | Dec 1, 2021 | Representation Learning, Video Recognition | Code Available | 1 |
| Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning | Nov 30, 2021 | 3D Human Pose Estimation, Camera Calibration | Code Available | 1 |
| Efficient Video Transformers with Spatial-Temporal Token Selection | Nov 23, 2021 | Video Recognition | Code Available | 1 |
| Attacking Video Recognition Models with Bullet-Screen Comments | Oct 29, 2021 | Adversarial Attack, Adversarial Attack on Video Classification | Code Available | 1 |
| Temporal-attentive Covariance Pooling Networks for Video Recognition | Oct 27, 2021 | Video Recognition | Code Available | 1 |
| Boosting the Transferability of Video Adversarial Examples via Temporal Translation | Oct 18, 2021 | Adversarial Attack, Translation | Code Available | 1 |
| Unsupervised 3D Pose Estimation for Hierarchical Dance Video Recognition | Sep 19, 2021 | 3D Pose Estimation, Pose Estimation | Code Available | 1 |
| Dynamic Network Quantization for Efficient Video Inference | Aug 23, 2021 | Quantization, Video Recognition | Code Available | 1 |
| Can An Image Classifier Suffice For Action Recognition? | Jun 26, 2021 | Action Recognition, image-classification | Code Available | 1 |
| Towards Long-Form Video Understanding | Jun 21, 2021 | Action Recognition, Form | Code Available | 1 |
| TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? | Jun 21, 2021 | Action Classification, Image Classification | Code Available | 1 |
| Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting | Jun 18, 2021 | Action Recognition, Action Recognition In Videos | Code Available | 1 |
| PyKale: Knowledge-Aware Machine Learning from Multiple Sources in Python | Jun 17, 2021 | BIG-bench Machine Learning, Dimensionality Reduction | Code Available | 1 |
| Space-time Mixing Attention for Video Transformer | Jun 10, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| Continual 3D Convolutional Neural Networks for Real-time Processing of Videos | May 31, 2021 | Action Classification, Video Recognition | Code Available | 1 |
| DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning | May 25, 2021 | Action Recognition, Long-range modeling | Code Available | 1 |
| Sharing Pain: Using Pain Domain Transfer for Video Recognition of Low Grade Orthopedic Pain in Horses | May 21, 2021 | Action Recognition, Fine-grained Action Recognition | Code Available | 1 |
| AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition | May 11, 2021 | Video Recognition | Code Available | 1 |
| Adaptive Focus for Efficient Video Recognition | May 7, 2021 | Computational Efficiency, GPU | Code Available | 1 |
| VideoLT: Large-scale Long-tailed Video Recognition | May 6, 2021 | image-classification, Image Classification | Code Available | 1 |
| FrameExit: Conditional Early Exiting for Efficient Video Recognition | Apr 27, 2021 | Video Recognition, Video Understanding | Code Available | 1 |
| Multiscale Vision Transformers | Apr 22, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| Visual Semantic Role Labeling for Video Understanding | Apr 2, 2021 | Semantic Role Labeling, Video Recognition | Code Available | 1 |
| Learning Versatile Neural Architectures by Propagating Network Codes | Mar 24, 2021 | Image Segmentation, Neural Architecture Search | Code Available | 1 |
| MoViNets: Mobile Video Networks for Efficient Video Recognition | Mar 21, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| PatchNet -- Short-range Template Matching for Efficient Video Processing | Mar 10, 2021 | Object, object-detection | Code Available | 1 |
| Piano Skills Assessment | Jan 13, 2021 | Action Quality Assessment, Audio Classification | Code Available | 1 |
| MVFNet: Multi-View Fusion Network for Efficient Video Recognition | Dec 13, 2020 | Action Classification, Action Recognition | Code Available | 1 |
| Learning Equivariant Representations | Dec 4, 2020 | 3D Shape Classification, General Classification | Code Available | 1 |
| Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition | Oct 20, 2020 | Action Recognition, Few Shot Action Recognition | Code Available | 1 |
| Dissected 3D CNNs: Temporal Skip Connections for Efficient Online Video Processing | Sep 30, 2020 | Action Classification, Video Recognition | Code Available | 1 |
| Learning Temporally Invariant and Localizable Features via Data Augmentation for Video Recognition | Aug 13, 2020 | Action Recognition, Data Augmentation | Code Available | 1 |
| Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework | Aug 6, 2020 | Action Recognition In Videos, Contrastive Learning | Code Available | 1 |
| RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition | Aug 1, 2020 | Action Recognition, Temporal Action Localization | Code Available | 1 |
| Adversarial Bipartite Graph Learning for Video Domain Adaptation | Jul 31, 2020 | Domain Adaptation, Graph Learning | Code Available | 1 |
| Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation | Jul 9, 2020 | Few-Shot Image Classification, Few-Shot Learning | Code Available | 1 |
| Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition | Jun 20, 2020 | Action Classification, Action Recognition | Code Available | 1 |
| Video Panoptic Segmentation | Jun 19, 2020 | Instance Segmentation, Panoptic Segmentation | Code Available | 1 |