| Audio-Visual Glance Network for Efficient Video Recognition | Aug 18, 2023 | Video Recognition, Video Understanding | Unverified | 0 |
| Orthogonal Temporal Interpolation for Zero-Shot Video Recognition | Aug 14, 2023 | Video Recognition, Zero-Shot Action Recognition | Code Available | 0 |
| On the Importance of Spatial Relations for Few-shot Action Recognition | Aug 14, 2023 | Action Recognition, Few-Shot action recognition | Unverified | 0 |
| View while Moving: Efficient Video Recognition in Long-untrimmed Videos | Aug 9, 2023 | Video Recognition | Unverified | 0 |
| TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter | Jun 22, 2023 | Question Answering, Retrieval | Code Available | 0 |
| Enhanced Multimodal Representation Learning with Cross-modal KD | Jun 13, 2023 | Contrastive Learning, Emotion Classification | Unverified | 0 |
| A two-way translation system of Chinese sign language based on computer vision | Jun 3, 2023 | Sentence, Sign Language Recognition | Unverified | 0 |
| Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles | Jun 1, 2023 | Action Classification, Action Recognition | Code Available | 0 |
| Spatiotemporal Attention-based Semantic Compression for Real-time Video Recognition | May 22, 2023 | Action Recognition, Decoder | Unverified | 0 |
| Inter-frame Accelerate Attack against Video Interpolation Models | May 11, 2023 | Adversarial Robustness, Video Frame Interpolation | Unverified | 0 |
| Multi-object Video Generation from Single Frame Layouts | May 6, 2023 | Image Generation, Object | Unverified | 0 |
| Use Your Head: Improving Long-Tail Video Recognition | Apr 3, 2023 | Video Recognition | Code Available | 0 |
| Efficient Decision-based Black-box Patch Attacks on Video Recognition | Mar 21, 2023 | Video Recognition | Unverified | 0 |
| Video Action Recognition with Attentive Semantic Units | Mar 17, 2023 | Action Recognition, Decoder | Unverified | 0 |
| MRET: Multi-resolution Transformer for Video Quality Assessment | Mar 13, 2023 | Video Quality Assessment, Video Recognition | Unverified | 0 |
| Maximizing Spatio-Temporal Entropy of Deep 3D CNNs for Efficient Video Recognition | Mar 5, 2023 | Action Recognition, Computational Efficiency | Code Available | 0 |
| Video4MRI: An Empirical Study on Brain Magnetic Resonance Image Analytics with CNN-based Video Classification Frameworks | Feb 24, 2023 | Classification, Data Augmentation | Unverified | 0 |
| Efficient Robustness Assessment via Adversarial Spatial-Temporal Focus on Videos | Jan 3, 2023 | Action Recognition, Adversarial Robustness | Code Available | 0 |
| Tiny Updater: Towards Efficient Neural Network-Driven Software Updating | Jan 1, 2023 | Efficient Neural Network, image-classification | Code Available | 0 |
| Algorithm and Hardware Co-Design of Energy-Efficient LSTM Networks for Video Recognition with Hierarchical Tucker Tensor Decomposition | Dec 5, 2022 | Tensor Decomposition, Video Recognition | Unverified | 0 |
| Temporal superimposed crossover module for effective continuous sign language | Nov 7, 2022 | image-classification, Image Classification | Code Available | 0 |
| REST: REtrieve & Self-Train for generative action recognition | Sep 29, 2022 | Action Recognition, Caption Generation | Unverified | 0 |
| On the Surprising Effectiveness of Transformers in Low-Labeled Video Recognition | Sep 15, 2022 | image-classification, Image Classification | Unverified | 0 |
| Video Mobile-Former: Video Recognition with Efficient Global Spatial-temporal Modeling | Aug 25, 2022 | Video Recognition | Unverified | 0 |
| Efficient Attention-free Video Shift Transformers | Aug 23, 2022 | Action Recognition, Video Recognition | Unverified | 0 |
| Adaptive occlusion sensitivity analysis for visually explaining video recognition networks | Jul 26, 2022 | Decision Making, image-classification | Code Available | 0 |
| Object State Change Classification in Egocentric Videos using the Divided Space-Time Attention Mechanism | Jul 24, 2022 | Object, Object State Change Classification | Code Available | 0 |
| NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition | Jul 21, 2022 | Action Recognition, Video Classification | Unverified | 0 |
| Temporal Saliency Query Network for Efficient Video Recognition | Jul 21, 2022 | Action Recognition, Video Recognition | Unverified | 0 |
| Is an Object-Centric Video Representation Beneficial for Transfer? | Jul 20, 2022 | Action Classification, Object | Unverified | 0 |
| VidConv: A modernized 2D ConvNet for Efficient Video Recognition | Jul 8, 2022 | Action Recognition, Video Recognition | Code Available | 0 |
| EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2022: Team HNU-FPV Technical Report | Jul 7, 2022 | Action Recognition, Domain Adaptation | Unverified | 0 |
| Exploring Temporally Dynamic Data Augmentation for Video Recognition | Jun 30, 2022 | Action Localization, Action Segmentation | Unverified | 0 |
| M&M Mix: A Multimodal Multiview Transformer Ensemble | Jun 20, 2022 | Action Recognition, Video Recognition | Unverified | 0 |
| MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing | Jun 13, 2022 | 3D Architecture, Action Classification | Code Available | 0 |
| Spatial-temporal Concept based Explanation of 3D ConvNets | Jun 9, 2022 | Action Classification, Video Recognition | Code Available | 0 |
| Noise-Tolerant Learning for Audio-Visual Action Recognition | May 16, 2022 | Action Recognition, Noise Estimation | Unverified | 0 |
| Class-Incremental Learning for Action Recognition in Videos | Mar 25, 2022 | Action Recognition, Action Recognition In Videos | Unverified | 0 |
| FAR: Fourier Aerial Video Recognition | Mar 21, 2022 | Action Recognition, Activity Recognition | Code Available | 0 |
| Gate-Shift-Fuse for Video Action Recognition | Mar 16, 2022 | Action Recognition, Temporal Action Localization | Code Available | 0 |
| Audio-Visual Fusion Layers for Event Type Aware Video Recognition | Feb 12, 2022 | Multi-Task Learning, Video Recognition | Unverified | 0 |
| Should I take a walk? Estimating Energy Expenditure from Video Data | Feb 1, 2022 | Video Recognition | Code Available | 0 |
| Action Keypoint Network for Efficient Video Recognition | Jan 17, 2022 | Action Recognition, Point Cloud Classification | Unverified | 0 |
| Condensing a Sequence to One Informative Frame for Video Recognition | Jan 11, 2022 | Motion Estimation, valid | Unverified | 0 |
| Optimization Planning for 3D ConvNets | Jan 11, 2022 | Video Recognition | Code Available | 0 |
| Improving Video Model Transfer With Dynamic Representation Learning | Jan 1, 2022 | Action Classification, Knowledge Distillation | Unverified | 0 |
| Recurring the Transformer for Video Action Recognition | Jan 1, 2022 | Action Recognition, GPU | Unverified | 0 |
| Cross-Modal Transferable Adversarial Attacks from Images to Videos | Dec 10, 2021 | Video Recognition | Unverified | 0 |
| Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search | Dec 9, 2021 | Neural Architecture Search, Video Recognition | Unverified | 0 |
| ST-ABN: Visual Explanation Taking into Account Spatio-temporal Information for Video Recognition | Oct 29, 2021 | Decision Making, Video Recognition | Code Available | 0 |