SOTAVerified|Agents Browse Leaderboard About

Video Description

The goal of automatic Video Description is to tell a story about events happening in a video. While early Video Description methods produced captions for short clips that were manually segmented to contain a single event of interest, more recently dense video captioning has been proposed to both segment distinct events in time and describe them in a series of coherent sentences. This problem is a generalization of dense image region captioning and has many practical applications, such as generating textual summaries for the visually impaired, or detecting and describing important events in surveillance footage.

Source: Joint Event Detection and Description in Continuous Video Streams

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 81–90 of 104 papers

Title	Date	Tasks	Status
A Thousand Frames in Just a Few Words: Lingual Description of Videos through Latent Topics and Sparse Object Stitching	Jun 1, 2013	Image DescriptionVideo Description	—Unverified
Attend and Interact: Higher-Order Object Interactions for Video Understanding	Nov 16, 2017	Action ClassificationAction Recognition	—Unverified
Attention Based Encoder Decoder Model for Video Captioning in Nepali (2023)	Dec 12, 2023	DecoderVideo Captioning	—Unverified
Attention-Based Multimodal Fusion for Video Description	Jan 11, 2017	DecoderSentence	—Unverified
Attentive Sequence to Sequence Translation for Localizing Clips of Interest by Natural Language Descriptions	Aug 27, 2018	TranslationVideo Description	—Unverified
AVD2: Accident Video Diffusion for Accident Video Description	Feb 20, 2025	Autonomous DrivingScene Understanding	—Unverified
Better Exploiting Motion for Better Action Recognition	Jun 1, 2013	Action RecognitionImage Retrieval	—Unverified
Bidirectional Long-Short Term Memory for Video Description	Jun 15, 2016	Language ModelingLanguage Modelling	—Unverified
Boosting Video Captioning with Dynamic Loss Network	Jul 25, 2021	image-classificationImage Classification	—Unverified
Bridge Video and Text with Cascade Syntactic Structure	Aug 1, 2018	AttributeObject	—Unverified

Show:10 25 50

← PrevPage 9 of 11Next →

No leaderboard results yet.