
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers


Showing 211–220 of 1149 papers

Title	Date	Tasks	Status	Hype	Score
Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation	Dec 16, 2021	Contrastive Learning, Representation Learning	Code Available	1	5
Contrastive Masked Autoencoders for Self-Supervised Video Hashing	Nov 21, 2022	Decoder, Retrieval	Code Available	1	5
MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions	May 16, 2021	Action Detection, Action Localization	Code Available	1	5
From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding	Sep 27, 2024	Video Understanding, Visual Reasoning	Code Available	1	5
AlanaVLM: A Multimodal Embodied AI Foundation Model for Egocentric Video Understanding	Jun 19, 2024	Question Answering, Spatial Reasoning	Code Available	1	5
From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering	May 30, 2022	counterfactual, Descriptive	Code Available	1	5
AVS-Mamba: Exploring Temporal and Multi-modal Mamba for Audio-Visual Segmentation	Jan 14, 2025	Mamba, Video Understanding	Code Available	1	5
Disentangle Your Dense Object Detector	Jul 7, 2021	Disentanglement, Object	Code Available	1	5
From My View to Yours: Ego-Augmented Learning in Large Vision Language Models for Understanding Exocentric Daily Living Activities	Jan 10, 2025	Human-Object Interaction Detection, Knowledge Distillation	Code Available	1	5
Mug-STAN: Adapting Image-Language Pretrained Models for General Video Understanding	Nov 25, 2023	Video Understanding	Code Available	1	5


No leaderboard results yet.