SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 911-920 of 1149 papers

Title | Status | Hype
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning | Code | 1
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions | Code | 1
Learning the Predictability of the Future | Code | 1
Discerning Generic Event Boundaries in Long-Form Wild Videos | | 0
End-to-end Temporal Action Detection with Transformer | Code | 1
Long-Short Temporal Contrastive Learning of Video Transformers | | 0
C^3: Compositional Counterfactual Contrastive Learning for Video-grounded Dialogues | | 0
Isolated Sign Recognition from RGB Video using Pose Flow and Self-Attention | Code | 1
VT-SSum: A Benchmark Dataset for Video Transcript Segmentation and Summarization | Code | 1
Towards Training Stronger Video Vision Transformers for EPIC-KITCHENS-100 Action Recognition | | 0

No leaderboard results yet.