
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 821-830 of 1149 papers

Title | Status | Hype
Temporally-Adaptive Models for Efficient Video Understanding | - | 0
M^3Net: Multi-view Encoding, Matching, and Fusion for Few-shot Fine-grained Action Recognition | - | 0
DPMix: Mixture of Depth and Point Cloud Video Experts for 4D Action Segmentation | - | 0
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation | - | 0
HA-ViD: A Human Assembly Video Dataset for Comprehensive Assembly Knowledge Understanding | Code | 0
VideoGLUE: Video General Understanding Evaluation of Foundation Models | - | 0
ICSVR: Investigating Compositional and Syntactic Understanding in Video Retrieval Models | Code | 0
Temporal Action Proposal Generation With Action Frequency Adaptive Network | Code | 0
Learning Space-Time Semantic Correspondences | - | 0
Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment | - | 0

No leaderboard results yet.