Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 951–1000 of 1149 papers

Title	Date	Tasks	Status
Spatio-Temporal Video Representation Learning for AI Based Video Playback Style Prediction	Oct 3, 2021	Action Recognition, Representation Learning	Unverified
Object Dynamics Distillation for Scene Decomposition and Representation	Sep 29, 2021	Object, Predict Future Video Frames	Unverified
Pairwise Emotional Relationship Recognition in Drama Videos: Dataset and Benchmark	Sep 23, 2021	Video Understanding	Code Available
A Multimodal Sentiment Dataset for Video Recommendation	Sep 17, 2021	Multimodal Sentiment Analysis, Sentiment Analysis	Unverified
Overview of Tencent Multi-modal Ads Video Understanding Challenge	Sep 16, 2021	Multi-Label Classification	Unverified
Multi-modal Representation Learning for Video Advertisement Content Structuring	Sep 4, 2021	Representation Learning, Re-Ranking	Unverified
Spatio-Temporal Perturbations for Video Attribution	Sep 1, 2021	Video Understanding	Code Available
LIGAR: Lightweight General-purpose Action Recognition	Aug 30, 2021	Action Recognition, Gesture Recognition	Unverified
Identity-aware Graph Memory Network for Action Detection	Aug 26, 2021	Action Detection, Graph Neural Network	Unverified
Learning an Augmented RGB Representation with Cross-Modal Knowledge Distillation for Action Detection	Aug 8, 2021	Action Detection, Knowledge Distillation	Unverified
O2NA: An Object-Oriented Non-Autoregressive Approach for Controllable Video Captioning	Aug 5, 2021	Attribute, Caption Generation	Unverified
CogME: A Cognition-Inspired Multi-Dimensional Evaluation Metric for Story Understanding	Jul 21, 2021	Question Answering, Sentence	Unverified
Spatio-Temporal Context for Action Detection	Jun 29, 2021	Action Detection, Video Understanding	Unverified
Discerning Generic Event Boundaries in Long-Form Wild Videos	Jun 18, 2021	Boundary Detection, Form	Unverified
Long-Short Temporal Contrastive Learning of Video Transformers	Jun 17, 2021	Action Recognition, Contrastive Learning	Unverified
C^3: Compositional Counterfactual Contrastive Learning for Video-grounded Dialogues	Jun 16, 2021	Contrastive Learning, Counterfactual	Unverified
Towards Training Stronger Video Vision Transformers for EPIC-KITCHENS-100 Action Recognition	Jun 9, 2021	Action Recognition, Point Cloud Classification	Unverified
Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking	Jun 7, 2021	Graph Neural Network, Multi-Person Pose Estimation	Unverified
Transformed ROIs for Capturing Visual Transformations in Videos	Jun 6, 2021	Action Recognition, Video Understanding	Unverified
A Study On the Effects of Pre-processing On Spatio-temporal Action Recognition Using Spiking Neural Networks Trained with STDP	May 31, 2021	Action Recognition, Spatio-temporal Action Recognition	Unverified
Highlight Timestamp Detection Model for Comedy Videos via Multimodal Sentiment Analysis	May 28, 2021	Multimodal Sentiment Analysis, Object Recognition	Unverified
VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding	May 20, 2021	Action Segmentation, Language Modeling	Unverified
Relation-aware Hierarchical Attention Framework for Video Question Answering	May 13, 2021	Question Answering, Relation	Code Available
Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions	May 10, 2021	Contrastive Learning, Retrieval	Unverified
Skimming and Scanning for Untrimmed Video Action Recognition	Apr 21, 2021	Action Recognition, Temporal Action Localization	Unverified
Camera Calibration and Player Localization in SoccerNet-v2 and Investigation of their Representations for Action Spotting	Apr 19, 2021	Action Spotting, Camera Calibration	Unverified
Temporal Query Networks for Fine-grained Video Understanding	Apr 19, 2021	Action Classification, Action Recognition	Unverified
Temporally smooth online action detection using cycle-consistent future anticipation	Apr 16, 2021	Action Detection, Autonomous Driving	Code Available
Adaptive Intermediate Representations for Video Understanding	Apr 14, 2021	Action Classification, Optical Flow Estimation	Unverified
Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation	Apr 10, 2021	Object, Object Detection	Unverified
FIBER: Fill-in-the-Blanks as a Challenging Video Understanding Evaluation Framework	Apr 9, 2021	Language Modelling, Multiple-choice	Code Available
M3L: Language-based Video Editing via Multi-Modal Multi-Level Transformers	Apr 2, 2021	Diagnostic, Video Editing	Unverified
Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation	Mar 30, 2021	Action Detection, Temporal Action Proposal Generation	Unverified
Unified Graph Structured Models for Video Understanding	Mar 29, 2021	Action Detection, Graph Classification	Unverified
Low-Fidelity End-to-End Video Encoder Pre-training for Temporal Action Localization	Mar 28, 2021	Action Classification, Action Localization	Unverified
ClawCraneNet: Leveraging Object-level Relation for Text-based Video Segmentation	Mar 19, 2021	Object, Referring Expression Segmentation	Unverified
Enhancing Transformer for Video Understanding Using Gated Multi-Level Attention and Temporal Adversarial Training	Mar 18, 2021	Video Understanding	Unverified
PcmNet: Position-Sensitive Context Modeling Network for Temporal Action Localization	Mar 9, 2021	Action Localization, Boundary Detection	Unverified
Unsupervised Motion Representation Enhanced Network for Action Recognition	Mar 5, 2021	Action Recognition, Optical Flow Estimation	Unverified
Win-Fail Action Recognition	Feb 15, 2021	Action Recognition, Action Understanding	Code Available
CAG-QIL: Context-Aware Actionness Grouping via Q Imitation Learning for Online Temporal Action Localization	Jan 1, 2021	Action Localization, Imitation Learning	Unverified
Global Self-Attention Networks	Jan 1, 2021	Video Understanding	Unverified
Cross-Attentional Audio-Visual Fusion for Weakly-Supervised Action Localization	Jan 1, 2021	Action Localization, Video Understanding	Unverified
Attention Is Not Enough: Mitigating the Distribution Discrepancy in Asynchronous Multimodal Sequence Fusion	Jan 1, 2021	Time Series, Time Series Analysis	Unverified
Understanding Action Sequences based on Video Captioning for Learning-from-Observation	Dec 9, 2020	Video Captioning, Video Understanding	Unverified
t-EVA: Time-Efficient t-SNE Video Annotation	Nov 26, 2020	Dimensionality Reduction, Video Classification	Unverified
Can Temporal Information Help with Contrastive Self-Supervised Learning?	Nov 25, 2020	Data Augmentation, Representation Learning	Unverified
Cycle-Contrast for Self-Supervised Video Representation Learning	Oct 28, 2020	Action Recognition, Contrastive Learning	Unverified
Co-attentional Transformers for Story-Based Video Understanding	Oct 27, 2020	Question Answering, Video Question Answering	Unverified
Egok360: A 360 Egocentric Kinetic Human Activity Video Dataset	Oct 15, 2020	Activity Recognition, Egocentric Activity Recognition	Unverified
