Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 91–100 of 1149 papers

Title	Date	Tasks	Status	Hype
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark	Nov 28, 2023	3D Question Answering (3D-QA), Diagnostic	Code Available	2
ActionFormer: Localizing Moments of Actions with Transformers	Feb 16, 2022	Action Localization, Action Recognition	Code Available	2
Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs	Jun 13, 2024	Benchmarking, Question Answering	Code Available	2
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future	Jul 18, 2023	Knowledge Distillation, object-detection	Code Available	2
Multi-granularity Correspondence Learning from Long-term Noisy Videos	Jan 30, 2024	Action Segmentation, Long Video Retrieval (Background Removed)	Code Available	2
PruneVid: Visual Token Pruning for Efficient Video Large Language Models	Dec 20, 2024	Video Understanding	Code Available	2
PyTorchVideo: A Deep Learning Library for Video Understanding	Nov 18, 2021	Deep Learning, Self-Supervised Learning	Code Available	2
Query-Dependent Video Representation for Moment Retrieval and Highlight Detection	Mar 24, 2023	Highlight Detection, Moment Retrieval	Code Available	2
AIM: Adapting Image Models for Efficient Video Action Recognition	Feb 6, 2023	Action Classification, Action Recognition	Code Available	2
Neptune: The Long Orbit to Benchmarking Long Video Understanding	Dec 12, 2024	Benchmarking, Multimodal Reasoning	Code Available	2

No leaderboard results yet.