SOTAVerified

Visual Storytelling

(Image credit: No Metrics Are Perfect)

Papers

Showing 41-50 of 115 papers

Title | Status | Hype
Discourse Analysis for Evaluating Coherence in Video Paragraph Captions | | 0
Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks | | 0
DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention | | 0
DiffuVST: Narrating Fictional Scenes with Global-History-Guided Denoising Models | | 0
BERT-hLSTMs: BERT and Hierarchical LSTMs for Visual Storytelling | | 0
Induction and Reference of Entities in a Visual Story | | 0
Incorporating Textual Evidence in Visual Storytelling | | 0
Improving Visual Storytelling with Multimodal Large Language Models | | 0
DANTE-AD: Dual-Vision Attention Network for Long-Term Audio Description | | 0
A System for Image Understanding using Sensemaking and Narrative | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GLAC Net | METEOR | 30.14 | | Unverified
2 | HEGR | BLEU-4 | 16.7 | | Unverified
3 | HBSG | BLEU-4 | 15.4 | | Unverified
4 | IRW | BLEU-4 | 15.4 | | Unverified
5 | CoVS | BLEU-4 | 15.2 | | Unverified
6 | SGEmb | BLEU-4 | 14.8 | | Unverified
7 | SentiStory | BLEU-4 | 14.8 | | Unverified
8 | SGVST | BLEU-4 | 14.7 | | Unverified
9 | INet | BLEU-4 | 14.7 | | Unverified
10 | TAVST (RL) | BLEU-4 | 14.6 | | Unverified