Video Description

The goal of automatic Video Description is to tell a story about events happening in a video. While early Video Description methods produced captions for short clips that were manually segmented to contain a single event of interest, more recently dense video captioning has been proposed to both segment distinct events in time and describe them in a series of coherent sentences. This problem is a generalization of dense image region captioning and has many practical applications, such as generating textual summaries for the visually impaired, or detecting and describing important events in surveillance footage.

Source: Joint Event Detection and Description in Continuous Video Streams

Papers

Showing 11–20 of 104 papers

Title	Date	Tasks	Status	Hype
PV-VTT: A Privacy-Centric Dataset for Mission-Specific Anomaly Detection and Natural Language Interpretation	Oct 30, 2024	Anomaly Detection, Descriptive	Unverified	0
FIOVA: A Multi-Annotator Benchmark for Human-Aligned Video Captioning	Oct 20, 2024	Diagnostic, Video Captioning	Unverified	0
VideoCLIP-XL: Advancing Long Description Understanding for Video CLIP Models	Oct 1, 2024	Hallucination, text similarity	Unverified	0
Technical Report: Competition Solution For Modelscope-Sora	Sep 24, 2024	Text-to-Video Generation, Video Description	Unverified	0
Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation	Aug 19, 2024	Instruction Following, Large Language Model	Unverified	0
SUSTechGAN: Image Generation for Object Detection in Adverse Conditions of Autonomous Driving	Jul 18, 2024	Autonomous Driving, Image Generation	Code Available	0
https://arxiv.org/abs/2407.00634	Jul 2, 2024	Video Captioning, Video Description	Code Available	0
Tarsier: Recipes for Training and Evaluating Large Video Description Models	Jun 30, 2024	Video Captioning, Video Description	Code Available	4
LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living	Jun 13, 2024	Benchmarking, Human-Object Interaction Detection	Unverified	0
A Labelled Dataset for Sentiment Analysis of Videos on YouTube, TikTok, and Other Sources about the 2024 Outbreak of Measles	Jun 11, 2024	Sentiment Analysis, Subjectivity Analysis	Unverified	0

No leaderboard results yet.