SOTAVerified

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 801-825 of 1149 papers

Title | Status | Hype
CVNets: High Performance Library for Computer Vision | Code | 6
Development of a MultiModal Annotation Framework and Dataset for Deep Video Understanding | | 0
From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering | Code | 1
Free Lunch for Surgical Video Understanding by Distilling Self-Supervisions | Code | 1
ETAD: Training Action Detection End to End on a Laptop | Code | 1
BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection | Code | 1
i-Code: An Integrative and Composable Multimodal Learning Framework | | 0
Overview of the MedVidQA 2022 Shared Task on Medical Video Question-Answering | | 0
Flamingo: a Visual Language Model for Few-Shot Learning | Code | 4
Causal Reasoning Meets Visual Representation Learning: A Prospective Study | | 0
Contrastive Language-Action Pre-training for Temporal Localization | | 0
Revealing Occlusions with 4D Neural Fields | | 0
A Multi-Person Video Dataset Annotation Method of Spatio-Temporally Actions | Code | 1
Less than Few: Self-Shot Video Instance Segmentation | | 0
ActAR: Actor-Driven Pose Embeddings for Video Action Recognition | | 0
Adversarial Machine Learning Attacks Against Video Anomaly Detection Systems | | 0
MM-SEAL: A Large-scale Video Dataset of Multi-person Multi-grained Spatio-temporally Action Localization | | 0
Temporal Alignment Networks for Long-term Video | Code | 1
An Empirical Study of End-to-End Temporal Action Detection | Code | 1
Long Movie Clip Classification with State-Space Video Models | Code | 1
PYSKL: a toolbox for skeleton-based video understanding | | 0
SPAct: Self-supervised Privacy Preservation for Action Recognition | Code | 1
How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning? | Code | 1
FitCLIP: Refining Large-Scale Pretrained Image-Text Models for Zero-Shot Video Understanding Tasks | Code | 0
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training | Code | 3
Page 33 of 46

No leaderboard results yet.