
video narration captioning

Human narration is another critical factor in understanding a multi-shot video. It often provides background knowledge and the commentator’s view of visual events. We conduct experiments on predicting the narration caption of a video shot and name this task single-shot narration captioning. We adopt the same model structure as single-shot video captioning with the ASR text as additional input, except that the prediction target is the narration caption.
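The setup described above can be pictured as a data-preparation step: the inputs are the shot's visual content and its ASR text, and the target is the narration caption. The sketch below is only an illustration under assumptions. `ShotExample`, `build_narration_example`, and the field names are hypothetical and do not come from the Shot2Story20K code, and the frame features are plain placeholders rather than real encoder outputs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShotExample:
    """One training example for a single video shot (hypothetical layout)."""
    frame_features: List[float]  # placeholder for the shot's visual features
    asr_text: str                # speech transcript, given as additional input
    target: str                  # text the model is trained to generate


def build_narration_example(frame_features, asr_text, narration_caption):
    """Pair the visual and ASR inputs with the narration caption as target."""
    return ShotExample(
        frame_features=list(frame_features),
        asr_text=asr_text.strip(),
        target=narration_caption.strip(),
    )


example = build_narration_example(
    frame_features=[0.1, 0.2, 0.3],
    asr_text=" now we add the flour ",
    narration_caption="The host explains that flour is added next.",
)
```

Only the `target` field distinguishes this from a visual-captioning example built the same way, which mirrors the statement that the model structure is unchanged and only the prediction target differs.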

Papers



Title	Date	Tasks	Status	Hype
Shot2Story20K: A New Benchmark for Comprehensive Understanding of Multi-shot Videos	Dec 16, 2023	Video Captioning, video narration captioning	Code Available	1


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Ours	BLEU-4	18.8	—	Unverified
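
For readers unfamiliar with the metric, BLEU-4 combines clipped 1-gram to 4-gram precisions with a brevity penalty. The following is a minimal sentence-level sketch against a single reference, with whitespace tokenization and no smoothing. It is not the evaluation code behind the claimed score, which presumably comes from corpus-level tooling and is reported on a 0 to 100 scale; the function names `ngram_counts` and `bleu4` are our own.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Sentence-level BLEU-4 in [0, 1], one reference, no smoothing."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives zero
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / 4)


score = bleu4("the cat sat on the mat", "the cat sat on a mat")
```

Here the four clipped precisions are 5/6, 3/5, 2/4, and 1/3, so the score is the fourth root of their product, 1/12. Multiplying by 100 gives the scale used in the table above.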