SOTAVerified

Visual Storytelling

(Image credit: No Metrics Are Perfect)

Papers

Showing 11-20 of 115 papers

Title | Status | Hype
Storybooth: Training-free Multi-Subject Consistency for Improved Visual Storytelling | - | 0
DANTE-AD: Dual-Vision Attention Network for Long-Term Audio Description | - | 0
VinaBench: Benchmark for Faithful and Consistent Visual Narratives | - | 0
MIRAGE: Multimodal Immersive Reasoning and Guided Exploration for Red-Team Jailbreak Attacks | - | 0
Toyteller: AI-powered Visual Storytelling Through Toy-Playing with Character Symbols | - | 0
Generative Visual Communication in the Era of Vision-Language Models | - | 0
FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations | Code | 3
A Comprehensive Survey and Guide to Multimodal Large Language Models in Vision-Language Tasks | - | 0
KAHANI: Culturally-Nuanced Visual Storytelling Pipeline for Non-Western Cultures | - | 0
Multimodal Large Language Models and Tunings: Vision, Language, Sensors, Audio, and Beyond | Code | 0

Benchmark Results

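# | Model | Metric | Claimed | Verified | Status
1 | GLAC Net | METEOR | 30.14 | - | Unverified
2 | HEGR | BLEU-4 | 16.7 | - | Unverified
3 | HBSG | BLEU-4 | 15.4 | - | Unverified
4 | IRW | BLEU-4 | 15.4 | - | Unverified
5 | CoVS | BLEU-4 | 15.2 | - | Unverified
6 | SGEmb | BLEU-4 | 14.8 | - | Unverified
7 | SentiStory | BLEU-4 | 14.8 | - | Unverified
8 | SGVST | BLEU-4 | 14.7 | - | Unverified
9 | INet | BLEU-4 | 14.7 | - | Unverified
10 | TAVST (RL) | BLEU-4 | 14.6 | - | Unverified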