SOTAVerified

Action Classification

Papers

Showing 101–150 of 457 papers

| Title | Status | Hype |
| --- | --- | --- |
| Open-Vocabulary Video Relation Extraction | Code | 1 |
| TSM: Temporal Shift Module for Efficient Video Understanding | Code | 1 |
| MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition | Code | 1 |
| DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition | Code | 1 |
| A Closer Look at Spatiotemporal Convolutions for Action Recognition | Code | 1 |
| Timeception for Complex Action Recognition | Code | 1 |
| Dissected 3D CNNs: Temporal Skip Connections for Efficient Online Video Processing | Code | 1 |
| Masked Feature Prediction for Self-Supervised Visual Pre-Training | Code | 1 |
| A Simple and Efficient Pipeline to Build an End-to-End Spatial-Temporal Action Detector | Code | 1 |
| UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning | Code | 1 |
| Let's Play for Action: Recognizing Activities of Daily Living by Learning from Life Simulation Video Games | Code | 1 |
| Boundary-sensitive Pre-training for Temporal Localization in Videos | Code | 1 |
| MAR: Masked Autoencoders for Efficient Action Recognition | Code | 1 |
| Video Contrastive Learning with Global Context | Code | 1 |
| MMNet: A Model-Based Multimodal Network for Human Action Recognition in RGB-D Videos | Code | 1 |
| Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification | Code | 1 |
| Autoregressive Adaptive Hypergraph Transformer for Skeleton-based Activity Recognition | Code | 1 |
| Dual-path Adaptation from Image to Video Transformers | Code | 1 |
| Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers | Code | 1 |
| VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living | Code | 1 |
| BABEL: Bodies, Action and Behavior with English Labels | Code | 1 |
| What Can Simple Arithmetic Operations Do for Temporal Modeling? | Code | 1 |
| Weakly-supervised Temporal Action Localization by Uncertainty Modeling | Code | 1 |
| XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning | Code | 1 |
| EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding | Code | 1 |
| ALIP: Adaptive Language-Image Pre-training with Synthetic Caption | Code | 1 |
| Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition | Code | 1 |
| Enriching Local and Global Contexts for Temporal Action Localization | Code | 1 |
| HierVL: Learning Hierarchical Video-Language Embeddings | Code | 1 |
| ViViT: A Video Vision Transformer | Code | 1 |
| Just Add π! Pose Induced Video Transformers for Understanding Activities of Daily Living | Code | 1 |
| EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition | Code | 1 |
| KNN-MMD: Cross Domain Wireless Sensing via Local Distribution Alignment | Code | 1 |
| Learning Spatiotemporal Features via Video and Text Pair Discrimination | Code | 1 |
| MotionSqueeze: Neural Motion Feature Learning for Video Understanding | Code | 1 |
| Learning To Recognize Procedural Activities with Distant Supervision | Code | 1 |
| ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning | Code | 1 |
| Make Your Training Flexible: Towards Deployment-Efficient Video Models | Code | 1 |
| Finding the Missing Data: A BERT-inspired Approach Against Package Loss in Wireless Sensing | Code | 1 |
| BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues | Code | 1 |
| Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning | Code | 1 |
| Memory-augmented Dense Predictive Coding for Video Representation Learning | Code | 1 |
| SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos | Code | 1 |
| End-to-End Fine-Grained Action Segmentation and Recognition Using Conditional Random Field Models and Discriminative Sparse Coding | | 0 |
| Adaptive Intermediate Representations for Video Understanding | | 0 |
| Egocentric Audio-Visual Noise Suppression | | 0 |
| ActionBytes: Learning From Trimmed Videos to Localize Actions | | 0 |
| IMUVIE: Pickup Timeline Action Localization via Motion Movies | | 0 |
| Efficient Two-Stream Motion and Appearance 3D CNNs for Video Classification | | 0 |
| Efficient Optimization for Average Precision SVM | | 0 |
Page 3 of 10

No leaderboard results yet.