SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 481–490 of 1149 papers

Title	Date	Tasks	Status	Hype
GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding	Jun 14, 2024	Activity RecognitionMMR total	—Unverified	0
Localizing Events in Videos with Multimodal Queries	Jun 14, 2024	Natural Language QueriesVideo Understanding	—Unverified	0
Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs	Jun 13, 2024	BenchmarkingQuestion Answering	CodeCode Available	2
LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living	Jun 13, 2024	BenchmarkingHuman-Object Interaction Detection	—Unverified	0
VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding	Jun 13, 2024	Dense Video CaptioningMVBench	CodeCode Available	3
Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams	Jun 12, 2024	cross-modal alignmentLanguage Modelling	CodeCode Available	3
LVBench: An Extreme Long Video Understanding Benchmark	Jun 12, 2024	Decision MakingVideo Understanding	CodeCode Available	2
Fewer Tokens and Fewer Videos: Extending Video Understanding Abilities in Large Vision-Language Models	Jun 12, 2024	Video Understanding	—Unverified	0
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos	Jun 12, 2024	counterfactualFuture prediction	CodeCode Available	1
MeMSVD: Long-Range Temporal Structure Capturing Using Incremental SVD	Jun 11, 2024	Video RecognitionVideo Understanding	—Unverified	0

Show:10 25 50

← PrevPage 49 of 115Next →

No leaderboard results yet.