SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 901–910 of 1149 papers

Title	Date	Tasks	Status	Hype
OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding	Apr 15, 2025	Semantic SegmentationVideo Generation	—Unverified	0
OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding	Apr 20, 2025	Language ModelingLanguage Modelling	—Unverified	0
Only Time Can Tell: Discovering Temporal Data for Temporal Modeling	Jul 19, 2019	BenchmarkingMotion Estimation	—Unverified	0
On the Limitations of Vision-Language Models in Understanding Image Transforms	Mar 12, 2025	Question AnsweringVideo Generation	—Unverified	0
Open-Set Video-based Facial Expression Recognition with Human Expression-sensitive Prompting	Apr 26, 2024	Facial Expression RecognitionMulti-Task Learning	—Unverified	0
Open Vocabulary Multi-Label Video Classification	Jul 12, 2024	Action ClassificationClassification	—Unverified	0
Open-Vocabulary Spatio-Temporal Action Detection	May 17, 2024	Action DetectionFine-Grained Action Detection	—Unverified	0
Optimizing GPT for Video Understanding: Zero-Shot Performance and Prompt Engineering	Feb 13, 2025	ClassificationPrompt Engineering	—Unverified	0
Overview of Tencent Multi-modal Ads Video Understanding Challenge	Sep 16, 2021	Multi-Label ClassificationMUlTI-LABEL-ClASSIFICATION	—Unverified	0
Overview of the MedVidQA 2022 Shared Task on Medical Video Question-Answering	May 1, 2022	Question AnsweringVideo Classification	—Unverified	0

Show:10 25 50

← PrevPage 91 of 115Next →

No leaderboard results yet.