Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 801–850 of 1149 papers

Title	Date	Tasks	Status	Hype
CVNets: High Performance Library for Computer Vision	Jun 4, 2022	Video Understanding, Vocal Bursts Intensity Prediction	Code Available	6
Development of a MultiModal Annotation Framework and Dataset for Deep Video Understanding	Jun 1, 2022	Knowledge Graphs, Video Understanding	Unverified	0
From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering	May 30, 2022	counterfactual, Descriptive	Code Available	1
Free Lunch for Surgical Video Understanding by Distilling Self-Supervisions	May 19, 2022	Contrastive Learning, Self-Supervised Learning	Code Available	1
ETAD: Training Action Detection End to End on a Laptop	May 14, 2022	Action Detection, GPU	Code Available	1
BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection	May 5, 2022	Action Detection, object-detection	Code Available	1
i-Code: An Integrative and Composable Multimodal Learning Framework	May 3, 2022	Contrastive Learning, Video Understanding	Unverified	0
Overview of the MedVidQA 2022 Shared Task on Medical Video Question-Answering	May 1, 2022	Question Answering, Video Classification	Unverified	0
Flamingo: a Visual Language Model for Few-Shot Learning	Apr 29, 2022	Few-Shot Learning, Generative Visual Question Answering	Code Available	4
Causal Reasoning Meets Visual Representation Learning: A Prospective Study	Apr 26, 2022	Benchmarking, Out-of-Distribution Generalization	Unverified	0
Contrastive Language-Action Pre-training for Temporal Localization	Apr 26, 2022	Action Localization, Contrastive Learning	Unverified	0
Revealing Occlusions with 4D Neural Fields	Apr 22, 2022	Video Understanding	Unverified	0
A Multi-Person Video Dataset Annotation Method of Spatio-Temporally Actions	Apr 21, 2022	Action Detection, Video Understanding	Code Available	1
Less than Few: Self-Shot Video Instance Segmentation	Apr 19, 2022	Few-Shot Learning, Instance Segmentation	Unverified	0
ActAR: Actor-Driven Pose Embeddings for Video Action Recognition	Apr 19, 2022	Action Recognition, Optical Flow Estimation	Unverified	0
Adversarial Machine Learning Attacks Against Video Anomaly Detection Systems	Apr 7, 2022	Anomaly Detection, BIG-bench Machine Learning	Unverified	0
MM-SEAL: A Large-scale Video Dataset of Multi-person Multi-grained Spatio-temporally Action Localization	Apr 6, 2022	Action Localization, Action Recognition	Unverified	0
Temporal Alignment Networks for Long-term Video	Apr 6, 2022	Action Recognition, Action Segmentation	Code Available	1
An Empirical Study of End-to-End Temporal Action Detection	Apr 6, 2022	Action Classification, Action Detection	Code Available	1
Long Movie Clip Classification with State-Space Video Models	Apr 4, 2022	Classification, Decoder	Code Available	1
PYSKL: a toolbox for skeleton-based video understanding	Apr 2, 2022	Skeleton Based Action Recognition, Video Understanding	Unverified	0
SPAct: Self-supervised Privacy Preservation for Action Recognition	Mar 29, 2022	Action Classification, Action Recognition	Code Available	1
How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning?	Mar 27, 2022	Self-Supervised Learning, Sensitivity	Code Available	1
FitCLIP: Refining Large-Scale Pretrained Image-Text Models for Zero-Shot Video Understanding Tasks	Mar 24, 2022	Action Recognition, Retrieval	Code Available	0
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training	Mar 23, 2022	4k, Action Classification	Code Available	3
On the Pitfalls of Batch Normalization for End-to-End Video Learning: A Study on Surgical Workflow Analysis	Mar 15, 2022	Video Understanding	Code Available	0
Human Gaze Guided Attention for Surgical Activity Recognition	Mar 9, 2022	Activity Recognition, Video Understanding	Unverified	0
Multi-Scale Self-Contrastive Learning with Hard Negative Mining for Weakly-Supervised Query-based Video Grounding	Mar 8, 2022	Contrastive Learning, Sentence	Unverified	0
Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection	Mar 1, 2022	Avg, Boundary Detection	Unverified	0
Domain Knowledge-Informed Self-Supervised Representations for Workout Form Assessment	Feb 28, 2022	3D Action Recognition, Action Analysis	Code Available	1
Concept Graph Neural Networks for Surgical Video Understanding	Feb 27, 2022	Video Understanding	Unverified	0
Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations	Feb 21, 2022	Answer Generation, Video Understanding	Unverified	0
ActionFormer: Localizing Moments of Actions with Transformers	Feb 16, 2022	Action Localization, Action Recognition	Code Available	2
Learning Optical Flow with Adaptive Graph Reasoning	Feb 8, 2022	Motion Estimation, Optical Flow Estimation	Code Available	1
A Coding Framework and Benchmark towards Low-Bitrate Video Understanding	Feb 6, 2022	Video Compression, Video Understanding	Code Available	0
A Dataset for Medical Instructional Video Classification and Question Answering	Jan 30, 2022	Classification, Question Answering	Code Available	1
Capturing Temporal Information in a Single Frame: Channel Sampling Strategies for Action Recognition	Jan 25, 2022	Action Recognition, Optical Flow Estimation	Code Available	0
End-to-end Generative Pretraining for Multimodal Video Captioning	Jan 20, 2022	Action Classification, Decoder	Unverified	0
Multiview Transformers for Video Recognition	Jan 12, 2022	Action Classification, Action Recognition	Unverified	0
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound	Jan 7, 2022	Action Classification, Navigate	Unverified	0
Memory-Guided Semantic Learning Network for Temporal Sentence Grounding	Jan 3, 2022	Sentence, Temporal Sentence Grounding	Unverified	0
Recurring the Transformer for Video Action Recognition	Jan 1, 2022	Action Recognition, GPU	Unverified	0
Improving Video Model Transfer With Dynamic Representation Learning	Jan 1, 2022	Action Classification, Knowledge Distillation	Unverified	0
YouMVOS: An Actor-Centric Multi-Shot Video Object Segmentation Dataset	Jan 1, 2022	Management, Segmentation	Unverified	0
UBoCo: Unsupervised Boundary Contrastive Learning for Generic Event Boundary Detection	Jan 1, 2022	Boundary Detection, Contrastive Learning	Unverified	0
VRDFormer: End-to-End Video Visual Relation Detection With Transformers	Jan 1, 2022	Object, Relation	Unverified	0
Video Joint Modelling Based on Hierarchical Transformer for Co-summarization	Dec 27, 2021	Retrieval, Supervised Video Summarization	Code Available	1
Exploiting Long-Term Dependencies for Generating Dynamic Scene Graphs	Dec 18, 2021	Graph Generation, Object	Code Available	0
Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation	Dec 16, 2021	Contrastive Learning, Representation Learning	Code Available	1
Discrete neural representations for explainable anomaly detection	Dec 10, 2021	Anomaly Detection, Object	Unverified	0
