SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 541–550 of 1149 papers

Title | Status | Hype
Towards Fine-Grained Video Question Answering | | 0
Towards Visual Discrimination and Reasoning of Real-World Physical Dynamics: Physics-Grounded Anomaly Detection | | 0
HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models | | 0
PreMind: Multi-Agent Video Understanding for Advanced Indexing of Presentation-style Videos | | 0
M-LLM Based Video Frame Selection for Efficient Video Understanding | | 0
InternVQA: Advancing Compressed Video Quality Assessment with Distilling Large Foundation Model | | 0
An Analysis of Data Transformation Effects on Segment Anything 2 | | 0
Fine-Grained Video Captioning through Scene Graph Consolidation | | 0
LongCaptioning: Unlocking the Power of Long Caption Generation in Large Multimodal Models | | 0
AVD2: Accident Video Diffusion for Accident Video Description | | 0

No leaderboard results yet.