SOTAVerified

Visual Storytelling

(Image credit: No Metrics Are Perfect)

Papers

Showing 101-115 of 115 papers

| Title | Status | Hype |
| --- | --- | --- |
| A Comprehensive Survey and Guide to Multimodal Large Language Models in Vision-Language Tasks | | 0 |
| DANTE-AD: Dual-Vision Attention Network for Long-Term Audio Description | | 0 |
| Ordered Attention for Coherent Visual Storytelling | | 0 |
| DiffuVST: Narrating Fictional Scenes with Global-History-Guided Denoising Models | | 0 |
| DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention | | 0 |
| Discourse Analysis for Evaluating Coherence in Video Paragraph Captions | | 0 |
| Diverse and Relevant Visual Storytelling with Scene Graph Embeddings | | 0 |
| Dixit: Interactive Visual Storytelling via Term Manipulation | | 0 |
| Towards Coherent Visual Storytelling with Ordered Image Attention | | 0 |
| Camera Trajectory Generation: A Comprehensive Survey of Methods, Metrics, and Future Directions | | 0 |
| Toyteller: AI-powered Visual Storytelling Through Toy-Playing with Character Symbols | | 0 |
| Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks | | 0 |
| GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography | | 0 |
| Generating Visual Stories with Grounded and Coreferent Characters | | 0 |
| Generative Visual Communication in the Era of Vision-Language Models | | 0 |

Benchmark Results

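| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GLAC Net | METEOR | 30.14 | | Unverified |
| 2 | HEGR | BLEU-4 | 16.7 | | Unverified |
| 3 | HBSG | BLEU-4 | 15.4 | | Unverified |
| 4 | IRW | BLEU-4 | 15.4 | | Unverified |
| 5 | CoVS | BLEU-4 | 15.2 | | Unverified |
| 6 | SGEmb | BLEU-4 | 14.8 | | Unverified |
| 7 | SentiStory | BLEU-4 | 14.8 | | Unverified |
| 8 | SGVST | BLEU-4 | 14.7 | | Unverified |
| 9 | INet | BLEU-4 | 14.7 | | Unverified |
| 10 | TAVST (RL) | BLEU-4 | 14.6 | | Unverified |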