SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 751–775 of 1149 papers

| Title | Status | Hype |
| --- | --- | --- |
| Temporal Action Segmentation: An Analysis of Modern Techniques | Code | 2 |
| How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios | Code | 0 |
| Self-supervised video pretraining yields robust and more human-aligned visual representations | | 0 |
| Students taught by multimodal teachers are superior action recognizers | | 0 |
| EgoTaskQA: Understanding Human Tasks in Egocentric Videos | Code | 1 |
| Compressed Vision for Efficient Video Understanding | | 0 |
| SoccerNet 2022 Challenges Results | Code | 1 |
| Learning to Focus on the Foreground for Temporal Sentence Grounding | | 0 |
| In-the-Wild Video Question Answering | | 0 |
| Learning Transferable Spatiotemporal Representations from Natural Script Knowledge | Code | 1 |
| Speeding Up Action Recognition Using Dynamic Accumulation of Residuals in Compressed Domain | | 0 |
| Streaming Video Temporal Action Segmentation In Real Time | Code | 1 |
| AVT: Audio-Video Transformer for Multimodal Action Recognition | | 0 |
| UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer | Code | 2 |
| Panoramic Vision Transformer for Saliency Detection in 360° Videos | Code | 1 |
| WildQA: In-the-Wild Video Question Answering | | 0 |
| EchoCoTr: Estimation of the Left Ventricular Ejection Fraction from Spatiotemporal Echocardiography | Code | 1 |
| Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions | | 0 |
| Visual Subtitle Feature Enhanced Video Outline Generation | | 0 |
| Identifying Auxiliary or Adversarial Tasks Using Necessary Condition Analysis for Adversarial Multi-task Video Understanding | | 0 |
| DeepSportradar-v1: Computer Vision Dataset for Sports Understanding with High Quality Annotations | Code | 1 |
| Motion Sensitive Contrastive Learning for Self-supervised Video Representation | | 0 |
| Exploring Anchor-based Detection for Ego4D Natural Language Query | | 0 |
| SA-NET.v2: Real-time vehicle detection from oblique UAV images with use of uncertainty estimation in deep meta-learning | | 0 |
| Two-Stream Transformer Architecture for Long Video Understanding | | 0 |
Page 31 of 46

No leaderboard results yet.