SOTAVerified

Action Classification

Papers

Showing 101–150 of 457 papers

Title | Status | Hype
Revisiting spatio-temporal layouts for compositional action recognition | Code | 1
UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition | Code | 1
Infrared and 3D skeleton feature fusion for RGB-D action recognition | Code | 1
DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition | Code | 1
roadscene2vec: A Tool for Extracting and Embedding Road Scene-Graphs | Code | 1
MoViNets: Mobile Video Networks for Efficient Video Recognition | Code | 1
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning | Code | 1
Busy-Quiet Video Disentangling for Video Classification | Code | 1
Boundary-sensitive Pre-training for Temporal Localization in Videos | Code | 1
Revisiting ResNets: Improved Training and Scaling Strategies | Code | 1
Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers | Code | 1
MVFNet: Multi-View Fusion Network for Efficient Video Recognition | Code | 1
Non-local Neural Networks | Code | 1
Visual Semantic Role Labeling | Code | 1
ReAct: Temporal Action Detection with Relational Queries | Code | 1
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment | Code | 1
Autoregressive Adaptive Hypergraph Transformer for Skeleton-based Activity Recognition | Code | 1
Dual-path Adaptation from Image to Video Transformers | Code | 1
AViD Dataset: Anonymized Videos from Diverse Countries | Code | 1
Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition | Code | 1
BABEL: Bodies, Action and Behavior with English Labels | Code | 1
Region-based Non-local Operation for Video Classification | Code | 1
Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning | Code | 1
ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning | Code | 1
EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding | Code | 1
Over-the-Air Adversarial Flickering Attacks against Video Recognition Networks | Code | 1
Enriching Local and Global Contexts for Temporal Action Localization | Code | 1
Delving Deep into One-Shot Skeleton-based Action Recognition with Diverse Occlusions | Code | 1
ViViT: A Video Vision Transformer | Code | 1
Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification | Code | 1
Proposal Relation Network for Temporal Action Detection | Code | 1
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition | Code | 1
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset | Code | 1
Representation Learning via Global Temporal Alignment and Cycle-Consistency | Code | 1
Self-supervised Video Transformer | Code | 1
Large Scale Holistic Video Understanding | Code | 1
High Quality Monocular Depth Estimation via Transfer Learning | Code | 1
HierVL: Learning Hierarchical Video-Language Embeddings | Code | 1
Finding the Missing Data: A BERT-inspired Approach Against Package Loss in Wireless Sensing | Code | 1
BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues | Code | 1
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | Code | 1
Implicit Temporal Modeling with Learnable Alignment for Video Recognition | Code | 1
Stand-Alone Inter-Frame Attention in Video Models | Code | 1
Back to the Future: Cycle Encoding Prediction for Self-supervised Contrastive Video Representation Learning | Code | 0
Do we really need temporal convolutions in action segmentation? | Code | 0
Person Segmentation and Action Classification for Multi-Channel Hemisphere Field of View LiDAR Sensors | Code | 0
ECO: Efficient Convolutional Network for Online Video Understanding | Code | 0
Pose And Joint-Aware Action Recognition | Code | 0
Object Priors for Classifying and Localizing Unseen Actions | Code | 0
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution | Code | 0
Page 3 of 10

No leaderboard results yet.