SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 661–670 of 1149 papers

| Title | Status | Hype |
|---|---|---|
| Helping Hands: An Object-Aware Ego-Centric Video Recognition Model | Code | 1 |
| Temporally-Adaptive Models for Efficient Video Understanding | | 0 |
| M^3Net: Multi-view Encoding, Matching, and Fusion for Few-shot Fine-grained Action Recognition | | 0 |
| MovieChat: From Dense Token to Sparse Memory for Long Video Understanding | Code | 2 |
| DPMix: Mixture of Depth and Point Cloud Video Experts for 4D Action Segmentation | | 0 |
| A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future | Code | 2 |
| Multimodal Distillation for Egocentric Action Recognition | Code | 1 |
| InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation | | 0 |
| HA-ViD: A Human Assembly Video Dataset for Comprehensive Assembly Knowledge Understanding | Code | 0 |
| Self-Adaptive Sampling for Efficient Video Question-Answering on Image--Text Models | Code | 1 |

No leaderboard results yet.