SOTAVerified

Action Classification

Papers

Showing 51-100 of 457 papers

Title | Status | Hype
AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation | Code | 1
A Closer Look at Spatiotemporal Convolutions for Action Recognition | Code | 1
Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition | Code | 1
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset | Code | 1
EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding | Code | 1
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning | Code | 1
AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders | Code | 1
Autoregressive Adaptive Hypergraph Transformer for Skeleton-based Activity Recognition | Code | 1
AViD Dataset: Anonymized Videos from Diverse Countries | Code | 1
BABEL: Bodies, Action and Behavior with English Labels | Code | 1
Weakly-supervised Temporal Action Localization by Uncertainty Modeling | Code | 1
Self-supervised Video Transformer | Code | 1
An Image is Worth 16x16 Words, What is a Video Worth? | Code | 1
Learning Spatiotemporal Features via Video and Text Pair Discrimination | Code | 1
Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition | Code | 1
Learning To Recognize Procedural Activities with Distant Supervision | Code | 1
MAR: Masked Autoencoders for Efficient Action Recognition | Code | 1
Mutual Modality Learning for Video Action Classification | Code | 1
ActionCLIP: A New Paradigm for Video Action Recognition | Code | 1
Class-Difficulty Based Methods for Long-Tailed Visual Recognition | Code | 1
Just Add π! Pose Induced Video Transformers for Understanding Activities of Daily Living | Code | 1
ALIP: Adaptive Language-Image Pre-training with Synthetic Caption | Code | 1
CAST: Cross-Attention in Space and Time for Video Action Recognition | Code | 1
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | Code | 1
Can Deep Learning Recognize Subtle Human Activities? | Code | 1
Alleviating Over-segmentation Errors by Detecting Action Boundaries | Code | 1
DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition | Code | 1
Infrared and 3D skeleton feature fusion for RGB-D action recognition | Code | 1
An Empirical Study of End-to-End Temporal Action Detection | Code | 1
Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification | Code | 1
CoCa: Contrastive Captioners are Image-Text Foundation Models | Code | 1
An Evaluation of Action Recognition Models on EPIC-Kitchens | Code | 1
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers | Code | 1
HierVL: Learning Hierarchical Video-Language Embeddings | Code | 1
Let's Play for Action: Recognizing Activities of Daily Living by Learning from Life Simulation Video Games | Code | 1
Continual 3D Convolutional Neural Networks for Real-time Processing of Videos | Code | 1
Masked Feature Prediction for Self-Supervised Visual Pre-Training | Code | 1
ConvNet Architecture Search for Spatiotemporal Feature Learning | Code | 1
Co-segmentation Inspired Attention Module for Video-based Computer Vision Tasks | Code | 1
Memory-augmented Dense Predictive Coding for Video Representation Learning | Code | 1
MMNet: A Model-Based Multimodal Network for Human Action Recognition in RGB-D Videos | Code | 1
CrossFi: A Cross Domain Wi-Fi Sensing Framework Based on Siamese Network | Code | 1
BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues | Code | 1
MoViNets: Mobile Video Networks for Efficient Video Recognition | Code | 1
High Quality Monocular Depth Estimation via Transfer Learning | Code | 1
Frame-wise Action Representations for Long Videos via Sequence Contrastive Learning | Code | 1
Boundary-sensitive Pre-training for Temporal Localization in Videos | Code | 1
OpenTAL: Towards Open Set Temporal Action Localization | Code | 1
Dual-path Adaptation from Image to Video Transformers | Code | 1
Frozen CLIP Models are Efficient Video Learners | Code | 1

No leaderboard results yet.