| GTM: Gray Temporal Model for Video Recognition | Oct 20, 2021 | Action Recognition, model | Unverified | 0 |
| Boosting the Transferability of Video Adversarial Examples via Temporal Translation | Oct 18, 2021 | Adversarial Attack, Translation | Code Available | 1 |
| TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device | Sep 27, 2021 | Video Recognition, Video Understanding | Code Available | 2 |
| QTTNet: Quantized Tensor Train Neural Networks for 3D Object and Video Recognition | Sep 20, 2021 | Quantization, Video Recognition | Code Available | 0 |
| Unsupervised 3D Pose Estimation for Hierarchical Dance Video Recognition | Sep 19, 2021 | 3D Pose Estimation, Pose Estimation | Code Available | 1 |
| Large-vocabulary Audio-visual Speech Recognition in Noisy Environments | Sep 10, 2021 | Audio-Visual Speech Recognition, Lipreading | Unverified | 0 |
| Revisiting 3D ResNets for Video Recognition | Sep 3, 2021 | Action Classification, Contrastive Learning | Code Available | 0 |
| Towards Learning a Vocabulary of Visual Concepts and Operators using Deep Neural Networks | Sep 1, 2021 | Video Recognition | Unverified | 0 |
| Searching for Two-Stream Models in Multivariate Space for Video Recognition | Aug 30, 2021 | Neural Architecture Search, Video Recognition | Unverified | 0 |
| Reinforcement Learning Based Sparse Black-box Adversarial Attack on Video Recognition Models | Aug 29, 2021 | Adversarial Attack, reinforcement-learning | Unverified | 0 |
| Dynamic Network Quantization for Efficient Video Inference | Aug 23, 2021 | Quantization, Video Recognition | Code Available | 1 |
| Towards Efficient Tensor Decomposition-Based DNN Model Compression with Optimization Framework | Jul 26, 2021 | image-classification, Image Classification | Unverified | 0 |
| Inter-intra Variant Dual Representations for Self-supervised Video Recognition | Jul 2, 2021 | Contrastive Learning, Representation Learning | Code Available | 0 |
| Can An Image Classifier Suffice For Action Recognition? | Jun 26, 2021 | Action Recognition, image-classification | Code Available | 1 |
| Video Swin Transformer | Jun 24, 2021 | Action Classification, Action Recognition | Code Available | 2 |
| TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? | Jun 21, 2021 | Action Classification, Image Classification | Code Available | 1 |
| Towards Long-Form Video Understanding | Jun 21, 2021 | Action Recognition, Form | Code Available | 1 |
| Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting | Jun 18, 2021 | Action Recognition, Action Recognition In Videos | Code Available | 1 |
| PyKale: Knowledge-Aware Machine Learning from Multiple Sources in Python | Jun 17, 2021 | BIG-bench Machine Learning, Dimensionality Reduction | Code Available | 1 |
| VidHarm: A Clip Based Dataset for Harmful Content Detection | Jun 15, 2021 | Video Recognition | Unverified | 0 |
| Space-time Mixing Attention for Video Transformer | Jun 10, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| Continual 3D Convolutional Neural Networks for Real-time Processing of Videos | May 31, 2021 | Action Classification, Video Recognition | Code Available | 1 |
| DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning | May 25, 2021 | Action Recognition, Long-range modeling | Code Available | 1 |
| Sharing Pain: Using Pain Domain Transfer for Video Recognition of Low Grade Orthopedic Pain in Horses | May 21, 2021 | Action Recognition, Fine-grained Action Recognition | Code Available | 1 |
| AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition | May 11, 2021 | Video Recognition | Code Available | 1 |
| Adaptive Focus for Efficient Video Recognition | May 7, 2021 | Computational Efficiency, GPU | Code Available | 1 |
| VideoLT: Large-scale Long-tailed Video Recognition | May 6, 2021 | image-classification, Image Classification | Code Available | 1 |
| Motion-Augmented Self-Training for Video Recognition at Smaller Scale | May 4, 2021 | Action Recognition, Optical Flow Estimation | Unverified | 0 |
| FrameExit: Conditional Early Exiting for Efficient Video Recognition | Apr 27, 2021 | Video Recognition, Video Understanding | Code Available | 1 |
| The Influence of Audio on Video Memorability with an Audio Gestalt Regulated Video Memorability System | Apr 23, 2021 | Multimodal Deep Learning, Video Recognition | Unverified | 0 |
| Multiscale Vision Transformers | Apr 22, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| HCMS: Hierarchical and Conditional Modality Selection for Efficient Video Recognition | Apr 20, 2021 | Video Recognition | Unverified | 0 |
| Towards Extremely Compact RNNs for Video Recognition with Fully Decomposed Hierarchical Tucker Structure | Apr 12, 2021 | Tensor Decomposition, Video Recognition | Unverified | 0 |
| On the Pitfalls of Learning with Limited Data: A Facial Expression Recognition Case Study | Apr 2, 2021 | Data Augmentation, Deep Learning | Unverified | 0 |
| Visual Semantic Role Labeling for Video Understanding | Apr 2, 2021 | Semantic Role Labeling, Video Recognition | Code Available | 1 |
| Multiview Pseudo-Labeling for Semi-supervised Learning from Video | Apr 1, 2021 | Representation Learning, Video Recognition | Unverified | 0 |
| Recognizing Actions in Videos from Unseen Viewpoints | Mar 30, 2021 | Action Classification, Action Recognition | Unverified | 0 |
| Learning Versatile Neural Architectures by Propagating Network Codes | Mar 24, 2021 | Image Segmentation, Neural Architecture Search | Code Available | 1 |
| MoViNets: Mobile Video Networks for Efficient Video Recognition | Mar 21, 2021 | Action Classification, Action Recognition | Code Available | 1 |
| PatchNet -- Short-range Template Matching for Efficient Video Processing | Mar 10, 2021 | Object, object-detection | Code Available | 1 |
| Video Transformer Network | Feb 1, 2021 | Action Classification, Action Recognition | Code Available | 0 |
| Piano Skills Assessment | Jan 13, 2021 | Action Quality Assessment, Audio Classification | Code Available | 1 |
| Multi-Modal Multi-Action Video Recognition | Jan 1, 2021 | Relation, Video Recognition | Code Available | 0 |
| Interactive Prototype Learning for Egocentric Action Recognition | Jan 1, 2021 | Action Recognition, Object | Unverified | 0 |
| 2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition | Dec 29, 2020 | Action Recognition, Policy Gradient Methods | Unverified | 0 |
| DeepGamble: Towards unlocking real-time player intelligence using multi-layer instance segmentation and attribute detection | Dec 14, 2020 | Attribute, Instance Segmentation | Unverified | 0 |
| MVFNet: Multi-View Fusion Network for Efficient Video Recognition | Dec 13, 2020 | Action Classification, Action Recognition | Code Available | 1 |
| Overcomplete Representations Against Adversarial Videos | Dec 8, 2020 | Adversarial Robustness, Decoder | Code Available | 0 |
| Learning Equivariant Representations | Dec 4, 2020 | 3D Shape Classification, General Classification | Code Available | 1 |
| Open-Ended Multi-Modal Relational Reasoning for Video Question Answering | Dec 1, 2020 | Question Answering, Relational Reasoning | Code Available | 0 |