| Real-time Online Video Detection with Temporal Smoothing Transformers | Sep 19, 2022 | Action Anticipation, Action Detection | Code Available | 1 |
| On the Surprising Effectiveness of Transformers in Low-Labeled Video Recognition | Sep 15, 2022 | image-classification, Image Classification | Unverified | 0 |
| Video Mobile-Former: Video Recognition with Efficient Global Spatial-temporal Modeling | Aug 25, 2022 | Video Recognition | Unverified | 0 |
| Efficient Attention-free Video Shift Transformers | Aug 23, 2022 | Action Recognition, Video Recognition | Unverified | 0 |
| Frozen CLIP Models are Efficient Video Learners | Aug 6, 2022 | Action Classification, Decoder | Code Available | 1 |
| Expanding Language-Image Pretrained Models for General Video Recognition | Aug 4, 2022 | Action Classification, Action Recognition | Code Available | 3 |
| Adaptive occlusion sensitivity analysis for visually explaining video recognition networks | Jul 26, 2022 | Decision Making, image-classification | Code Available | 0 |
| MAR: Masked Autoencoders for Efficient Action Recognition | Jul 24, 2022 | Action Classification, Action Recognition | Code Available | 1 |
| Object State Change Classification in Egocentric Videos using the Divided Space-Time Attention Mechanism | Jul 24, 2022 | Object, Object State Change Classification | Code Available | 0 |
| NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition | Jul 21, 2022 | Action Recognition, Video Classification | Unverified | 0 |
| Temporal Saliency Query Network for Efficient Video Recognition | Jul 21, 2022 | Action Recognition, Video Recognition | Unverified | 0 |
| Is an Object-Centric Video Representation Beneficial for Transfer? | Jul 20, 2022 | Action Classification, Object | Unverified | 0 |
| VidConv: A modernized 2D ConvNet for Efficient Video Recognition | Jul 8, 2022 | Action Recognition, Video Recognition | Code Available | 0 |
| EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2022: Team HNU-FPV Technical Report | Jul 7, 2022 | Action Recognition, Domain Adaptation | Unverified | 0 |
| Revisiting Classifier: Transferring Vision-Language Models for Video Recognition | Jul 4, 2022 | Action Classification, Action Recognition | Code Available | 2 |
| Exploring Temporally Dynamic Data Augmentation for Video Recognition | Jun 30, 2022 | Action Localization, Action Segmentation | Unverified | 0 |
| M&M Mix: A Multimodal Multiview Transformer Ensemble | Jun 20, 2022 | Action Recognition, Video Recognition | Unverified | 0 |
| MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing | Jun 13, 2022 | 3D Architecture, Action Classification | Code Available | 0 |
| Spatial-temporal Concept based Explanation of 3D ConvNets | Jun 9, 2022 | Action Classification, Video Recognition | Code Available | 0 |
| AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition | May 26, 2022 | Action Recognition, Video Recognition | Code Available | 2 |
| Noise-Tolerant Learning for Audio-Visual Action Recognition | May 16, 2022 | Action Recognition, Noise Estimation | Unverified | 0 |
| In Defense of Image Pre-Training for Spatiotemporal Recognition | May 3, 2022 | GPU, STS | Code Available | 1 |
| Long Movie Clip Classification with State-Space Video Models | Apr 4, 2022 | Classification, Decoder | Code Available | 1 |
| Class-Incremental Learning for Action Recognition in Videos | Mar 25, 2022 | Action Recognition, Action Recognition In Videos | Unverified | 0 |
| FAR: Fourier Aerial Video Recognition | Mar 21, 2022 | Action Recognition, Activity Recognition | Code Available | 0 |
| Group Contextualization for Video Recognition | Mar 18, 2022 | Action Recognition, Egocentric Activity Recognition | Code Available | 1 |
| Gate-Shift-Fuse for Video Action Recognition | Mar 16, 2022 | Action Recognition, Temporal Action Localization | Code Available | 0 |
| Audio-Visual Fusion Layers for Event Type Aware Video Recognition | Feb 12, 2022 | Multi-Task Learning, Video Recognition | Unverified | 0 |
| Should I take a walk? Estimating Energy Expenditure from Video Data | Feb 1, 2022 | Video Recognition | Code Available | 0 |
| Fast Differentiable Matrix Square Root and Inverse Square Root | Jan 29, 2022 | Style Transfer, Video Recognition | Code Available | 1 |
| MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition | Jan 20, 2022 | Action Anticipation, Action Classification | Code Available | 1 |
| Action Keypoint Network for Efficient Video Recognition | Jan 17, 2022 | Action Recognition, Point Cloud Classification | Unverified | 0 |
| OCSampler: Compressing Videos to One Clip with Single-step Sampling | Jan 12, 2022 | GPU, Video Recognition | Code Available | 1 |
| Condensing a Sequence to One Informative Frame for Video Recognition | Jan 11, 2022 | Motion Estimation, valid | Unverified | 0 |
| Optimization Planning for 3D ConvNets | Jan 11, 2022 | Video Recognition | Code Available | 0 |
| Glance and Focus Networks for Dynamic Visual Recognition | Jan 9, 2022 | image-classification, Image Classification | Code Available | 1 |
| Recurring the Transformer for Video Action Recognition | Jan 1, 2022 | Action Recognition, GPU | Unverified | 0 |
| Improving Video Model Transfer With Dynamic Representation Learning | Jan 1, 2022 | Action Classification, Knowledge Distillation | Unverified | 0 |
| AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition | Dec 28, 2021 | Computational Efficiency, Diversity | Code Available | 1 |
| Cross-Modal Transferable Adversarial Attacks from Images to Videos | Dec 10, 2021 | Video Recognition | Unverified | 0 |
| DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition | Dec 9, 2021 | Video Recognition | Code Available | 1 |
| Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search | Dec 9, 2021 | Neural Architecture Search, Video Recognition | Unverified | 0 |
| MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | Dec 2, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| TokenLearner: Adaptive Space-Time Tokenization for Videos | Dec 1, 2021 | Representation Learning, Video Recognition | Code Available | 1 |
| Pooling by Sliced-Wasserstein Embedding | Dec 1, 2021 | Graph Learning, image-classification | Code Available | 1 |
| Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning | Nov 30, 2021 | 3D Human Pose Estimation, Camera Calibration | Code Available | 1 |
| Efficient Video Transformers with Spatial-Temporal Token Selection | Nov 23, 2021 | Video Recognition | Code Available | 1 |
| Attacking Video Recognition Models with Bullet-Screen Comments | Oct 29, 2021 | Adversarial Attack, Adversarial Attack on Video Classification | Code Available | 1 |
| ST-ABN: Visual Explanation Taking into Account Spatio-temporal Information for Video Recognition | Oct 29, 2021 | Decision Making, Video Recognition | Code Available | 0 |
| Temporal-attentive Covariance Pooling Networks for Video Recognition | Oct 27, 2021 | Video Recognition | Code Available | 1 |