Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 501–510 of 1149 papers

Title	Date	Tasks	Status	Hype
Temporal Grounding of Activities using Multimodal Large Language Models	May 30, 2024	Video Understanding	Unverified	0
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark	May 30, 2024	DeepFake Detection, Mamba	Code Available	2
EgoSurgery-Phase: A Dataset of Surgical Phase Recognition from Egocentric Open Surgery Videos	May 30, 2024	Action Recognition, Surgical phase recognition	Code Available	1
VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos	May 29, 2024	EgoSchema, MME	Code Available	2
Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions	May 28, 2024	Action Recognition, Video Recognition	Unverified	0
MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning	May 28, 2024	Decision Making, Video Understanding	Unverified	0
Hawk: Learning to Understand Open-World Video Anomalies	May 27, 2024	Anomaly Detection, Question Answering	Code Available	3
Streaming Long Video Understanding with Large Language Models	May 25, 2024	Question Answering, Video Understanding	Unverified	0
MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models	May 23, 2024	Action Recognition, Action Segmentation	Unverified	0
Dense Connector for MLLMs	May 22, 2024	Video Understanding	Code Available	2

No leaderboard results yet.