SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 601–625 of 1149 papers

Title | Status | Hype
What can Off-the-Shelves Large Multi-Modal Models do for Dynamic Scene Graph Generation? | | 0
What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets | | 0
When Work Matters: Transforming Classical Network Structures to Graph CNN | | 0
WildQA: In-the-Wild Video Question Answering | | 0
Wolf: Captioning Everything with a World Summarization Framework | | 0
WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning | | 0
WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs | | 0
X-LeBench: A Benchmark for Extremely Long Egocentric Video Understanding | | 0
YouMVOS: An Actor-Centric Multi-Shot Video Object Segmentation Dataset | | 0
YouTube-8M Video Understanding Challenge Approach and Applications | | 0
ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection | | 0
Zero-shot Action Localization via the Confidence of Large Vision-Language Models | | 0
Zero-Shot Action Recognition in Surveillance Videos | | 0
Zero-Shot Action Recognition in Videos: A Survey | | 0
Zero-Shot Long-Form Video Understanding through Screenplay | | 0
Zero-shot Shark Tracking and Biometrics from Aerial Imagery | | 0
Hierarchical Video Frame Sequence Representation with Deep Convolutional Graph Network | | 0
Zero-Shot Video Question Answering with Procedural Programs | | 0
1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation | | 0
Multimodal Fusion and Coherence Modeling for Video Topic Segmentation | | 0
FE-Adapter: Adapting Image-based Emotion Classifiers to Videos | | 0
An Analysis of Data Transformation Effects on Segment Anything 2 | | 0
PreMind: Multi-Agent Video Understanding for Advanced Indexing of Presentation-style Videos | | 0
2nd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation | | 0
3DSRBench: A Comprehensive 3D Spatial Reasoning Benchmark | | 0

No leaderboard results yet.