SOTAVerified

Visual Storytelling

(Image credit: No Metrics Are Perfect)

Papers

Showing 1–50 of 115 papers

| Title | Status | Hype |
| Shape2Animal: Creative Animal Generation from Natural Silhouettes | | 0 |
| JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent | | 0 |
| Consistent Story Generation with Asymmetry Zigzag Sampling | Code | 0 |
| Camera Trajectory Generation: A Comprehensive Survey of Methods, Metrics, and Future Directions | | 0 |
| LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers | | 0 |
| Action2Dialogue: Generating Character-Centric Narratives from Scene-Level Prompts | | 0 |
| StoryReasoning Dataset: Using Chain-of-Thought for Scene Understanding and Grounded Story Generation | Code | 1 |
| VIST-GPT: Ushering in the Era of Visual Storytelling with LLMs? | | 0 |
| FLIP Reasoning Challenge | Code | 0 |
| GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography | | 0 |
| Storybooth: Training-free Multi-Subject Consistency for Improved Visual Storytelling | | 0 |
| DANTE-AD: Dual-Vision Attention Network for Long-Term Audio Description | | 0 |
| VinaBench: Benchmark for Faithful and Consistent Visual Narratives | | 0 |
| MIRAGE: Multimodal Immersive Reasoning and Guided Exploration for Red-Team Jailbreak Attacks | | 0 |
| Toyteller: AI-powered Visual Storytelling Through Toy-Playing with Character Symbols | | 0 |
| Generative Visual Communication in the Era of Vision-Language Models | | 0 |
| FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations | Code | 3 |
| A Comprehensive Survey and Guide to Multimodal Large Language Models in Vision-Language Tasks | | 0 |
| KAHANI: Culturally-Nuanced Visual Storytelling Pipeline for Non-Western Cultures | | 0 |
| Multimodal Large Language Models and Tunings: Vision, Language, Sensors, Audio, and Beyond | Code | 0 |
| Generating Visual Stories with Grounded and Coreferent Characters | | 0 |
| Alfie: Democratising RGBA Image Generation With No $ | Code | 2 |
| Semantic Alignment for Multimodal Large Language Models | | 0 |
| Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models | | 0 |
| Context-aware Visual Storytelling with Visual Prefix Tuning and Contrastive Learning | | 0 |
| Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling | Code | 1 |
| ContextualStory: Consistent Visual Storytelling with Spatially-Enhanced and Storyline Context | Code | 0 |
| Not (yet) the whole story: Evaluating Visual Storytelling Requires More than Measuring Coherence, Grounding, and Repetition | Code | 0 |
| Improving Visual Storytelling with Multimodal Large Language Models | | 0 |
| CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation | Code | 2 |
| Gorgeous: Create Your Desired Character Facial Makeup from Any Ideas | Code | 1 |
| TARN-VIST: Topic Aware Reinforcement Network for Visual Storytelling | | 0 |
| AesopAgent: Agent-driven Evolutionary System on Story-to-Video Production | | 0 |
| Metamorpheus: Interactive, Affective, and Creative Dream Narration Through Metaphorical Visual Storytelling | | 0 |
| SCO-VIST: Social Interaction Commonsense Knowledge-based Visual Storytelling | | 0 |
| Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models | Code | 3 |
| MagicScroll: Nontypical Aspect-Ratio Image Generation for Visual Storytelling via Multi-Layered Semantic-Aware Denoising | | 0 |
| DiffuVST: Narrating Fictional Scenes with Global-History-Guided Denoising Models | | 0 |
| inkn'hue: Enhancing Manga Colorization from Multiple Priors with Alignment Multi-Encoder VAE | Code | 1 |
| GROOViST: A Metric for Grounding Objects in Visual Storytelling | Code | 0 |
| Visual Storytelling with Question-Answer Plans | | 0 |
| Envisioning Narrative Intelligence: A Creative Visual Storytelling Anthology | Code | 0 |
| Comics for Everyone: Generating Accessible Text Descriptions for Comic Strips | | 0 |
| TouchStone: Evaluating Vision-Language Models by Language Models | Code | 1 |
| Text-Only Training for Visual Storytelling | | 0 |
| Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation | Code | 2 |
| AOG-LSTM: An adaptive attention neural network for visual storytelling | | 0 |
| Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models | Code | 2 |
| Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings | | 0 |
| Visual Transformation Telling | Code | 0 |

Benchmark Results

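| # | Model | Metric | Claimed | Verified | Status |
| 1 | GLAC Net | METEOR | 30.14 | | Unverified |
| 2 | HEGR | BLEU-4 | 16.7 | | Unverified |
| 3 | HBSG | BLEU-4 | 15.4 | | Unverified |
| 4 | IRW | BLEU-4 | 15.4 | | Unverified |
| 5 | CoVS | BLEU-4 | 15.2 | | Unverified |
| 6 | SGEmb | BLEU-4 | 14.8 | | Unverified |
| 7 | SentiStory | BLEU-4 | 14.8 | | Unverified |
| 8 | SGVST | BLEU-4 | 14.7 | | Unverified |
| 9 | INet | BLEU-4 | 14.7 | | Unverified |
| 10 | TAVST (RL) | BLEU-4 | 14.6 | | Unverified |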