SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 71–80 of 1149 papers

Title	Date	Tasks	Status	Hype
Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding	May 23, 2025	FormQuestion Answering	—Unverified	0
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design	May 22, 2025	CPUGPU	CodeCode Available	2
SoccerChat: Integrating Multimodal Data for Enhanced Soccer Game Understanding	May 22, 2025	Action ClassificationAutomatic Speech Recognition	CodeCode Available	0
Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning	May 22, 2025	Misinformationreinforcement-learning	CodeCode Available	1
Four Eyes Are Better Than Two: Harnessing the Collaborative Potential of Large Models via Differentiated Thinking and Complementary Ensembles	May 22, 2025	EgoSchemaFew-Shot Learning	—Unverified	0
ViQAgent: Zero-Shot Video Question Answering via Agent with Open-Vocabulary Grounding Validation	May 21, 2025	Decision MakingLanguage Modeling	CodeCode Available	0
ViaRL: Adaptive Temporal Grounding via Visual Iterated Amplification Reinforcement Learning	May 21, 2025	Pseudo LabelReinforcement Learning (RL)	—Unverified	0
Leveraging Foundation Models for Multimodal Graph-Based Action Recognition	May 21, 2025	Action RecognitionGraph Attention	—Unverified	0
LiveVLM: Efficient Online Video Understanding via Streaming-Oriented KV Cache and Retrieval	May 21, 2025	Autonomous DrivingQuestion Answering	—Unverified	0
Clapper: Compact Learning and Video Representation in VLMs	May 21, 2025	Video Understanding	—Unverified	0

Show:10 25 50

← PrevPage 8 of 115Next →

No leaderboard results yet.