SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 276-300 of 1149 papers

| Title | Status | Hype |
| --- | --- | --- |
| Whether and When does Endoscopy Domain Pretraining Make Sense? | Code | 1 |
| Streaming Video Model | Code | 1 |
| TimeBalance: Temporally-Invariant and Temporally-Distinctive Video Representations for Semi-Supervised Action Recognition | Code | 1 |
| Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos | Code | 1 |
| Dual-path Adaptation from Image to Video Transformers | Code | 1 |
| TemporalMaxer: Maximize Temporal Context with only Max Pooling for Temporal Action Localization | Code | 1 |
| Localizing Moments in Long Video Via Multimodal Guidance | Code | 1 |
| Test of Time: Instilling Video-Language Models with a Sense of Time | Code | 1 |
| Boosting Single Image Super-Resolution via Partial Channel Shifting | Code | 1 |
| Modeling Video As Stochastic Processes for Fine-Grained Video Representation Learning | Code | 1 |
| Towards Smooth Video Composition | Code | 1 |
| MOMA-LRG: Language-Refined Graphs for Multi-Object Multi-Actor Activity Parsing | Code | 1 |
| Contrastive Masked Autoencoders for Self-Supervised Video Hashing | Code | 1 |
| EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens | Code | 1 |
| InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges | Code | 1 |
| VTC: Improving Video-Text Retrieval with User Comments | Code | 1 |
| EgoTaskQA: Understanding Human Tasks in Egocentric Videos | Code | 1 |
| SoccerNet 2022 Challenges Results | Code | 1 |
| Learning Transferable Spatiotemporal Representations from Natural Script Knowledge | Code | 1 |
| Streaming Video Temporal Action Segmentation In Real Time | Code | 1 |
| Panoramic Vision Transformer for Saliency Detection in 360° Videos | Code | 1 |
| EchoCoTr: Estimation of the Left Ventricular Ejection Fraction from Spatiotemporal Echocardiography | Code | 1 |
| DeepSportradar-v1: Computer Vision Dataset for Sports Understanding with High Quality Annotations | Code | 1 |
| Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding | Code | 1 |
| Static and Dynamic Concepts for Self-supervised Video Representation Learning | Code | 1 |

No leaderboard results yet.