SOTAVerified

Action Classification

Papers

Showing 201-250 of 457 papers

Title | Status | Hype
UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition | Code | 1
Let's Play for Action: Recognizing Activities of Daily Living by Learning from Life Simulation Video Games | Code | 1
Attention Bottlenecks for Multimodal Fusion | Code | 0
Video Swin Transformer | Code | 2
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning | Code | 1
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? | Code | 1
TNT: Text-Conditioned Network with Transductive Inference for Few-Shot Video Classification | Code | 0
Proposal Relation Network for Temporal Action Detection | Code | 1
BABEL: Bodies, Action and Behavior with English Labels | Code | 1
Space-time Mixing Attention for Video Transformer | Code | 1
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers | Code | 1
CT-Net: Channel Tensorization Network for Video Classification | Code | 1
Continual 3D Convolutional Neural Networks for Real-time Processing of Videos | Code | 1
Distributed Learning with Strategic Users: A Repeated Game Approach | - | 0
VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living | Code | 1
Representation Learning via Global Temporal Alignment and Cycle-Consistency | Code | 1
Unsupervised Visual Representation Learning by Tracking Patches in Video | Code | 1
VidTr: Video Transformer Without Convolutions | - | 0
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text | Code | 1
Multiscale Vision Transformers | Code | 1
Temporal Query Networks for Fine-grained Video Understanding | - | 0
Adaptive Intermediate Representations for Video Understanding | - | 0
Object Priors for Classifying and Localizing Unseen Actions | Code | 0
Zeus: Efficiently Localizing Actions in Videos using Reinforcement Learning | - | 0
TubeR: Tubelet Transformer for Video Action Detection | Code | 1
Contrastive Learning of Single-Cell Phenotypic Representations for Treatment Classification | - | 0
Recognizing Actions in Videos from Unseen Viewpoints | - | 0
Busy-Quiet Video Disentangling for Video Classification | Code | 1
ViViT: A Video Vision Transformer | Code | 1
Low-Fidelity End-to-End Video Encoder Pre-training for Temporal Action Localization | - | 0
An Image is Worth 16x16 Words, What is a Video Worth? | Code | 1
MoViNets: Mobile Video Networks for Efficient Video Recognition | Code | 1
Revisiting ResNets: Improved Training and Scaling Strategies | Code | 1
Domain and View-point Agnostic Hand Action Recognition | Code | 0
Is Space-Time Attention All You Need for Video Understanding? | Code | 2
Video Transformer Network | Code | 0
TCLR: Temporal Contrastive Learning for Video Representation | Code | 1
Human Action Recognition Based on Multi-scale Feature Maps from Depth Video Sequences | - | 0
Watch Only Once: An End-to-End Video Action Detection Framework | - | 0
TDN: Temporal Difference Networks for Efficient Action Recognition | Code | 1
Weakly-Supervised Action Localization and Action Recognition using Global-Local Attention of 3D CNN | - | 0
MVFNet: Multi-View Fusion Network for Efficient Video Recognition | Code | 1
Hierarchical Human Action Classification with Network Pruning | - | 0
Real-time Spatio-temporal Action Localization via Learning Motion Representation | - | 0
Depth-Aware Action Recognition: Pose-Motion Encoding through Temporal Heatmaps | - | 0
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks | Code | 1
Boundary-sensitive Pre-training for Temporal Localization in Videos | Code | 1
3D attention mechanism for fine-grained classification of table tennis strokes using a Twin Spatio-Temporal Convolutional Neural Networks | - | 0
Improved Soccer Action Spotting using both Audio and Video Streams | - | 0
Mutual Modality Learning for Video Action Classification | Code | 1

No leaderboard results yet.