
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers


Showing 901–925 of 1149 papers

Title	Date	Tasks	Status
Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens	Jun 13, 2022	Action Recognition, Video Understanding	Unverified
Efficient Annotation and Learning for 3D Hand Pose Estimation: A Survey	Jun 5, 2022	3D Hand Pose Estimation, Domain Adaptation	Unverified
Development of a MultiModal Annotation Framework and Dataset for Deep Video Understanding	Jun 1, 2022	Knowledge Graphs, Video Understanding	Unverified
i-Code: An Integrative and Composable Multimodal Learning Framework	May 3, 2022	Contrastive Learning, Video Understanding	Unverified
Overview of the MedVidQA 2022 Shared Task on Medical Video Question-Answering	May 1, 2022	Question Answering, Video Classification	Unverified
Contrastive Language-Action Pre-training for Temporal Localization	Apr 26, 2022	Action Localization, Contrastive Learning	Unverified
Causal Reasoning Meets Visual Representation Learning: A Prospective Study	Apr 26, 2022	Benchmarking, Out-of-Distribution Generalization	Unverified
Revealing Occlusions with 4D Neural Fields	Apr 22, 2022	Video Understanding	Unverified
Less than Few: Self-Shot Video Instance Segmentation	Apr 19, 2022	Few-Shot Learning, Instance Segmentation	Unverified
ActAR: Actor-Driven Pose Embeddings for Video Action Recognition	Apr 19, 2022	Action Recognition, Optical Flow Estimation	Unverified
Adversarial Machine Learning Attacks Against Video Anomaly Detection Systems	Apr 7, 2022	Anomaly Detection, BIG-bench Machine Learning	Unverified
MM-SEAL: A Large-scale Video Dataset of Multi-person Multi-grained Spatio-temporally Action Localization	Apr 6, 2022	Action Localization, Action Recognition	Unverified
PYSKL: a toolbox for skeleton-based video understanding	Apr 2, 2022	Skeleton Based Action Recognition, Video Understanding	Unverified
FitCLIP: Refining Large-Scale Pretrained Image-Text Models for Zero-Shot Video Understanding Tasks	Mar 24, 2022	Action Recognition, Retrieval	Code Available
On the Pitfalls of Batch Normalization for End-to-End Video Learning: A Study on Surgical Workflow Analysis	Mar 15, 2022	Video Understanding	Code Available
Human Gaze Guided Attention for Surgical Activity Recognition	Mar 9, 2022	Activity Recognition, Video Understanding	Unverified
Multi-Scale Self-Contrastive Learning with Hard Negative Mining for Weakly-Supervised Query-based Video Grounding	Mar 8, 2022	Contrastive Learning, Sentence	Unverified
Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection	Mar 1, 2022	Avg, Boundary Detection	Unverified
Concept Graph Neural Networks for Surgical Video Understanding	Feb 27, 2022	Video Understanding	Unverified
Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations	Feb 21, 2022	Answer Generation, Video Understanding	Unverified
A Coding Framework and Benchmark towards Low-Bitrate Video Understanding	Feb 6, 2022	Video Compression, Video Understanding	Code Available
Capturing Temporal Information in a Single Frame: Channel Sampling Strategies for Action Recognition	Jan 25, 2022	Action Recognition, Optical Flow Estimation	Code Available
End-to-end Generative Pretraining for Multimodal Video Captioning	Jan 20, 2022	Action Classification, Decoder	Unverified
Multiview Transformers for Video Recognition	Jan 12, 2022	Action Classification, Action Recognition	Unverified
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound	Jan 7, 2022	Action Classification, Navigate	Unverified

