Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 621–630 of 1149 papers

Title	Date	Tasks	Status	Hype
CAST: Cross-Attention in Space and Time for Video Action Recognition	Nov 30, 2023	Action Classification, Action Recognition	Code Available	1
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives	Nov 30, 2023	Video Understanding	Code Available	2
Eliciting In-Context Learning in Vision-Language Models for Videos Through Curated Data Distributional Properties	Nov 28, 2023	In-Context Learning, Video Understanding	Code Available	1
Panoptic Video Scene Graph Generation	Nov 28, 2023	Graph Generation, Panoptic Scene Graph Generation	Code Available	1
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark	Nov 28, 2023	3D Question Answering (3D-QA), Diagnostic	Code Available	2
Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning	Nov 27, 2023	Action Classification, Action Recognition	Code Available	1
GPT4Video: A Unified Multimodal Large Language Model for lnstruction-Followed Understanding and Safety-Aware Generation	Nov 25, 2023	Instruction Following, Language Modeling	Unverified	0
Mug-STAN: Adapting Image-Language Pretrained Models for General Video Understanding	Nov 25, 2023	Video Understanding	Code Available	1
PG-Video-LLaVA: Pixel Grounding Large Video-Language Models	Nov 22, 2023	Benchmarking, Phrase Grounding	Code Available	2
Vamos: Versatile Action Models for Video Understanding	Nov 22, 2023	EgoSchema, Hard Attention	Code Available	0
