SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 411–420 of 1149 papers

Title | Status | Hype
Dynamic Multistep Reasoning based on Video Scene Graph for Video Question Answering | | 0
Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding | | 0
Dynamic Graph Modules for Modeling Object-Object Interactions in Activity Recognition | | 0
Dynamic Appearance: A Video Representation for Action Recognition with Joint Training | | 0
Beyond the Camera: Neural Networks in World Coordinates | | 0
Analyzing Zero-Shot Abilities of Vision-Language Models on Video Understanding Tasks | | 0
DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs | | 0
DualX-VSR: Dual Axial Spatial×Temporal Transformer for Real-World Video Super-Resolution without Motion Compensation | | 0
Beyond still images: Temporal features and input variance resilience | | 0
DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM | | 0

No leaderboard results yet.