SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 1031-1040 of 1149 papers

Title | Status | Hype
Capturing Temporal Information in a Single Frame: Channel Sampling Strategies for Action Recognition | Code | 0
B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens | Code | 0
In My Perspective, In My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition | Code | 0
ViP: Video Platform for PyTorch | Code | 0
ViQAgent: Zero-Shot Video Question Answering via Agent with Open-Vocabulary Grounding Validation | Code | 0
Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision | Code | 0
ICSVR: Investigating Compositional and Syntactic Understanding in Video Retrieval Models | Code | 0
https://arxiv.org/abs/2407.00634 | Code | 0
How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios | Code | 0
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios | Code | 0

No leaderboard results yet.