SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 971–980 of 1149 papers

Title	Date	Tasks	Status	Hype	Score
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning	Sep 30, 2024	Mixture-of-ExpertsOptical Character Recognition (OCR)	—Unverified	0	0
MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding	Jun 20, 2024	FormVideo Understanding	—Unverified	0	0
MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning	May 28, 2024	Decision MakingVideo Understanding	—Unverified	0	0
MM-Ego: Towards Building Egocentric Multimodal LLMs	Oct 9, 2024	Video Understanding	—Unverified	0	0
Moment Quantization for Video Temporal Grounding	Apr 3, 2025	QuantizationVideo Understanding	—Unverified	0	0
MomentSeeker: A Task-Oriented Benchmark For Long-Video Moment Retrieval	Feb 18, 2025	Action RecognitionMoment Retrieval	—Unverified	0	0
Morph: Flexible Acceleration for 3D CNN-based Video Understanding	Oct 16, 2018	MORPHVideo Recognition	—Unverified	0	0
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models	Jan 6, 2025	BenchmarkingFeature Compression	—Unverified	0	0
Motion-Guided Masking for Spatiotemporal Representation Learning	Aug 24, 2023	Domain AdaptationRepresentation Learning	—Unverified	0	0
Motion Sensitive Contrastive Learning for Self-supervised Video Representation	Aug 12, 2022	Contrastive LearningRepresentation Learning	—Unverified	0	0

Show:10 25 50

← PrevPage 98 of 115Next →

No leaderboard results yet.