SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 861-870 of 1149 papers

Title | Status | Hype
Mimic The Raw Domain: Accelerating Action Recognition in the Compressed Domain | | 0
M-LLM Based Video Frame Selection for Efficient Video Understanding | | 0
MLVTG: Mamba-Based Feature Alignment and LLM-Driven Purification for Multi-Modal Video Temporal Grounding | | 0
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning | | 0
MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding | | 0
MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning | | 0
MM-Ego: Towards Building Egocentric Multimodal LLMs | | 0
Moment Quantization for Video Temporal Grounding | | 0
MomentSeeker: A Task-Oriented Benchmark For Long-Video Moment Retrieval | | 0
Morph: Flexible Acceleration for 3D CNN-based Video Understanding | | 0

No leaderboard results yet.