SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 481–490 of 1149 papers

Title	Date	Tasks	Status	Hype
PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild	Apr 15, 2025	SegmentationSemantic Segmentation	—Unverified	0
OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding	Apr 15, 2025	Semantic SegmentationVideo Generation	—Unverified	0
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model	Apr 14, 2025	Computational EfficiencyLanguage Modeling	—Unverified	0
Towards Efficient and Robust Moment Retrieval System: A Unified Framework for Multi-Granularity Models and Temporal Reranking	Apr 11, 2025	Moment RetrievalQuestion Answering	—Unverified	0
SF2T: Self-supervised Fragment Finetuning of Video-LLMs for Fine-Grained Understanding	Apr 10, 2025	Video Understanding	—Unverified	0
How Can Objects Help Video-Language Understanding?	Apr 10, 2025	Image CaptioningObject	—Unverified	0
VideoExpert: Augmented LLM for Temporal-Sensitive Video Understanding	Apr 10, 2025	Instruction FollowingVideo Understanding	—Unverified	0
From Broadcast to Minimap: Achieving State-of-the-Art SoccerNet Game State Reconstruction	Apr 8, 2025	Game State ReconstructionJersey Number Recognition	—Unverified	0
From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models	Apr 8, 2025	In-Context LearningInstruction Following	—Unverified	0
InstructionBench: An Instructional Video Understanding Benchmark	Apr 7, 2025	Common Sense ReasoningMultiple-choice	—Unverified	0

Show:10 25 50

← PrevPage 49 of 115Next →

No leaderboard results yet.