SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 1051-1060 of 1149 papers

Title | Status | Hype
Recurring the Transformer for Video Action Recognition | | 0
Relational Space-Time Query in Long-Form Videos | | 0
Residual Frames with Efficient Pseudo-3D CNN for Human Action Recognition | | 0
ResNetVLLM -- Multi-modal Vision LLM for the Video Understanding Task | | 0
Rethinking Image-to-Video Adaptation: An Object-centric Perspective | | 0
Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data | | 0
Retrieval-based Video Language Model for Efficient Long Video Question Answering | | 0
RETTA: Retrieval-Enhanced Test-Time Adaptation for Zero-Shot Video Captioning | | 0
Revealing Occlusions with 4D Neural Fields | | 0
Revisiting Kernel Temporal Segmentation as an Adaptive Tokenizer for Long-form Video Understanding | | 0

No leaderboard results yet.