SOTAVerified

Action Classification

Papers

Showing 51–100 of 457 papers

Title | Status | Hype
AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation | Code | 1
A Closer Look at Spatiotemporal Convolutions for Action Recognition | Code | 1
Multiscale Vision Transformers | Code | 1
Representation Learning via Global Temporal Alignment and Cycle-Consistency | Code | 1
MotionSqueeze: Neural Motion Feature Learning for Video Understanding | Code | 1
Just Add π! Pose Induced Video Transformers for Understanding Activities of Daily Living | Code | 1
AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders | Code | 1
Autoregressive Adaptive Hypergraph Transformer for Skeleton-based Activity Recognition | Code | 1
AViD Dataset: Anonymized Videos from Diverse Countries | Code | 1
BABEL: Bodies, Action and Behavior with English Labels | Code | 1
Weakly-supervised Temporal Action Localization by Uncertainty Modeling | Code | 1
KNN-MMD: Cross Domain Wireless Sensing via Local Distribution Alignment | Code | 1
MMNet: A Model-Based Multimodal Network for Human Action Recognition in RGB-D Videos | Code | 1
MoViNets: Mobile Video Networks for Efficient Video Recognition | Code | 1
Mutual Modality Learning for Video Action Classification | Code | 1
Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization | Code | 1
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition | Code | 1
EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding | Code | 1
Memory-augmented Dense Predictive Coding for Video Representation Learning | Code | 1
ALIP: Adaptive Language-Image Pre-training with Synthetic Caption | Code | 1
CAST: Cross-Attention in Space and Time for Video Action Recognition | Code | 1
Masked Feature Prediction for Self-Supervised Visual Pre-Training | Code | 1
Can Deep Learning Recognize Subtle Human Activities? | Code | 1
MAR: Masked Autoencoders for Efficient Action Recognition | Code | 1
Class-Difficulty Based Methods for Long-Tailed Visual Recognition | Code | 1
Alleviating Over-segmentation Errors by Detecting Action Boundaries | Code | 1
ActionCLIP: A New Paradigm for Video Action Recognition | Code | 1
Enriching Local and Global Contexts for Temporal Action Localization | Code | 1
An Empirical Study of End-to-End Temporal Action Detection | Code | 1
Frame-wise Action Representations for Long Videos via Sequence Contrastive Learning | Code | 1
CoCa: Contrastive Captioners are Image-Text Foundation Models | Code | 1
An Evaluation of Action Recognition Models on EPIC-Kitchens | Code | 1
Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning | Code | 1
Florence: A New Foundation Model for Computer Vision | Code | 1
Finding the Missing Data: A BERT-inspired Approach Against Package Loss in Wireless Sensing | Code | 1
Continual 3D Convolutional Neural Networks for Real-time Processing of Videos | Code | 1
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition | Code | 1
ConvNet Architecture Search for Spatiotemporal Feature Learning | Code | 1
Dual-path Adaptation from Image to Video Transformers | Code | 1
Frozen CLIP Models are Efficient Video Learners | Code | 1
BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues | Code | 1
CrossFi: A Cross Domain Wi-Fi Sensing Framework Based on Siamese Network | Code | 1
CT-Net: Channel Tensorization Network for Video Classification | Code | 1
Open-Vocabulary Video Relation Extraction | Code | 1
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | Code | 1
Let's Play for Action: Recognizing Activities of Daily Living by Learning from Life Simulation Video Games | Code | 1
HierVL: Learning Hierarchical Video-Language Embeddings | Code | 1
High Quality Monocular Depth Estimation via Transfer Learning | Code | 1
Implicit Temporal Modeling with Learnable Alignment for Video Recognition | Code | 1
Learning To Recognize Procedural Activities with Distant Supervision | Code | 1
Page 2 of 10

No leaderboard results yet.