SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 511–520 of 1149 papers

Title	Date	Tasks	Status	Hype	Score
NoisyActions2M: A Multimedia Dataset for Video Understanding from Noisy Labels	Oct 13, 2021	Action ClassificationSelf-Supervised Learning	CodeCode Available	0	5
MOFO: MOtion FOcused Self-Supervision for Video Understanding	Aug 23, 2023	Action ClassificationAction Recognition	CodeCode Available	0	5
MOD: A Deep Mixture Model with Online Knowledge Distillation for Large Scale Video Temporal Concept Localization	Oct 27, 2019	Knowledge DistillationVideo Understanding	CodeCode Available	0	5
End-to-End Learning of Motion Representation for Video Understanding	Apr 2, 2018	Action RecognitionOptical Flow Estimation	CodeCode Available	0	5
MINOTAUR: Multi-task Video Grounding From Multimodal Queries	Feb 16, 2023	Action DetectionSentence	CodeCode Available	0	5
METok: Multi-Stage Event-based Token Compression for Efficient Long Video Understanding	Jun 3, 2025	Video Understanding	CodeCode Available	0	5
B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens	Dec 13, 2024	Language ModelingLanguage Modelling	CodeCode Available	0	5
EgoVLM: Policy Optimization for Egocentric Video Understanding	Jun 3, 2025	EgoSchemaQuestion Answering	CodeCode Available	0	5
Masked Autoencoders for Egocentric Video Understanding @ Ego4D Challenge 2022	Nov 18, 2022	Object State Change ClassificationTemporal Localization	CodeCode Available	0	5
Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision	Jun 6, 2025	Video Understanding	CodeCode Available	0	5

Show:10 25 50

← PrevPage 52 of 115Next →

No leaderboard results yet.