SOTAVerified

Action Classification

Papers

Showing 1–10 of 457 papers

Title | Status | Hype
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | Code | 7
VideoMamba: State Space Model for Efficient Video Understanding | Code | 5
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video | Code | 4
InternVideo: General Video Foundation Models via Generative and Discriminative Learning | Code | 4
Towards Universal Soccer Video Understanding | Code | 3
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training | Code | 3
Expanding Language-Image Pretrained Models for General Video Recognition | Code | 3
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | Code | 3
MARLIN: Masked Autoencoder for facial video Representation LearnINg | Code | 2
Learning Video Representations from Large Language Models | Code | 2

No leaderboard results yet.