
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers


Showing 221–230 of 1149 papers

Title	Date	Tasks	Status	Hype
InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding	Jun 28, 2024	Multiple-choice, Video Understanding	Code Available	1
Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads	Jun 27, 2024	Diversity, image-classification	Code Available	1
Snakes and Ladders: Two Steps Up for VideoMamba	Jun 27, 2024	Action Recognition, Mamba	Code Available	1
Towards Event-oriented Long Video Understanding	Jun 20, 2024	Video Understanding	Code Available	1
AlanaVLM: A Multimodal Embodied AI Foundation Model for Egocentric Video Understanding	Jun 19, 2024	Question Answering, Spatial Reasoning	Code Available	1
Slot State Space Models	Jun 18, 2024	Mamba, State Space Models	Code Available	1
VideoVista: A Versatile Benchmark for Video Understanding and Reasoning	Jun 17, 2024	Anomaly Detection, Logical Reasoning	Code Available	1
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos	Jun 12, 2024	counterfactual, Future prediction	Code Available	1
Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos	Jun 3, 2024	Mistake Detection, Online Mistake Detection	Code Available	1
EgoSurgery-Phase: A Dataset of Surgical Phase Recognition from Egocentric Open Surgery Videos	May 30, 2024	Action Recognition, Surgical phase recognition	Code Available	1


No leaderboard results yet.