
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers


Showing 121–130 of 1149 papers

Title	Date	Tasks	Status	Hype
Multimodal Long Video Modeling Based on Temporal Dynamic Context	Apr 14, 2025	Video Understanding	Code Available	1
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning	Apr 13, 2025	Question Answering, reinforcement-learning	Code Available	2
F^3Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos	Apr 11, 2025	Action Understanding, Event Detection	Code Available	1
Towards Efficient and Robust Moment Retrieval System: A Unified Framework for Multi-Granularity Models and Temporal Reranking	Apr 11, 2025	Moment Retrieval, Question Answering	Unverified	0
How Can Objects Help Video-Language Understanding?	Apr 10, 2025	Image Captioning, Object	Unverified	0
VideoExpert: Augmented LLM for Temporal-Sensitive Video Understanding	Apr 10, 2025	Instruction Following, Video Understanding	Unverified	0
SF2T: Self-supervised Fragment Finetuning of Video-LLMs for Fine-Grained Understanding	Apr 10, 2025	Video Understanding	Unverified	0
VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning	Apr 9, 2025	MVBench, Object Tracking	Code Available	3
From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models	Apr 8, 2025	In-Context Learning, Instruction Following	Unverified	0
From Broadcast to Minimap: Achieving State-of-the-Art SoccerNet Game State Reconstruction	Apr 8, 2025	Game State Reconstruction, Jersey Number Recognition	Unverified	0

