SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 561–570 of 1149 papers

Title | Status | Hype
GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning | | 0
Grounding-MD: Grounded Video-language Pre-training for Open-World Moment Detection | | 0
Cultivating DNN Diversity for Large Scale Video Labelling | | 0
Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation | | 0
Grounding Action Descriptions in Videos | | 0
Grounded Video Situation Recognition | | 0
CTM: Collaborative Temporal Modeling for Action Recognition | | 0
CrossVideo: Self-supervised Cross-modal Contrastive Learning for Point Cloud Video Understanding | | 0
Audio-visual training for improved grounding in video-text LLMs | | 0
Motion Sensitive Contrastive Learning for Self-supervised Video Representation | | 0

No leaderboard results yet.