| Using Inter-Sentence Diverse Beam Search to Reduce Redundancy in Visual Storytelling | May 30, 2018 | Image to text, Sentence | Unverified | 0 | 0 |
| A System for Image Understanding using Sensemaking and Narrative | Jan 21, 2022 | Visual Storytelling | Unverified | 0 | 0 |
| A survey on knowledge-enhanced multimodal learning | Nov 19, 2022 | Conditional Image Generation, Factual Visual Question Answering | Unverified | 0 | 0 |
| A Pipeline for Creative Visual Storytelling | Jul 21, 2018 | Visual Storytelling | Unverified | 0 | 0 |
| JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent | Jun 21, 2025 | Instruction Following, Large Language Model | Unverified | 0 | 0 |
| KAHANI: Culturally-Nuanced Visual Storytelling Pipeline for Non-Western Cultures | Oct 25, 2024 | Story Generation, Visual Storytelling | Unverified | 0 | 0 |
| Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication | Nov 11, 2019 | Image Captioning, Question Generation | Unverified | 0 | 0 |
| VinaBench: Benchmark for Faithful and Consistent Visual Narratives | Mar 26, 2025 | Visual Storytelling | Unverified | 0 | 0 |
| Knowledge-enriched Attention Network with Group-wise Semantic for Visual Storytelling | Mar 10, 2022 | Decoder, Story Generation | Unverified | 0 | 0 |
| Vision Transformer Based Model for Describing a Set of Images as a Story | Oct 6, 2022 | Language Modelling, Sentence | Unverified | 0 | 0 |
| Learning to Rank Visual Stories From Human Ranking Data | Nov 16, 2021 | Learning-To-Rank, Text Generation | Unverified | 0 | 0 |
| VIST-GPT: Ushering in the Era of Visual Storytelling with LLMs? | Apr 27, 2025 | Visual Grounding, Visual Storytelling | Unverified | 0 | 0 |
| LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers | May 29, 2025 | Denoising, Image Generation | Unverified | 0 | 0 |
| MagicScroll: Nontypical Aspect-Ratio Image Generation for Visual Storytelling via Multi-Layered Semantic-Aware Denoising | Dec 18, 2023 | Denoising, Image Generation | Unverified | 0 | 0 |
| Metamorpheus: Interactive, Affective, and Creative Dream Narration Through Metaphorical Visual Storytelling | Mar 1, 2024 | ARC, Visual Storytelling | Unverified | 0 | 0 |
| MIRAGE: Multimodal Immersive Reasoning and Guided Exploration for Red-Team Jailbreak Attacks | Mar 24, 2025 | Visual Storytelling | Unverified | 0 | 0 |
| Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings | May 3, 2023 | Data Augmentation, Question Answering | Unverified | 0 | 0 |
| "My Way of Telling a Story": Persona based Grounded Story Generation | Jun 14, 2019 | Decoder, Story Generation | Unverified | 0 | 0 |
| "My Way of Telling a Story": Persona based Grounded Story Generation | Aug 1, 2019 | Decoder, Story Generation | Unverified | 0 | 0 |
| Neural Event Extraction from Movies Description | Jun 1, 2018 | Event Extraction, Machine Translation | Unverified | 0 | 0 |
| Visual Storytelling with Question-Answer Plans | Oct 8, 2023 | Visual Storytelling | Unverified | 0 | 0 |
| A-CAP: Anticipation Captioning with Commonsense Knowledge | Apr 13, 2023 | Image Captioning, Language Modeling | Unverified | 0 | 0 |
| On How Users Edit Computer-Generated Visual Stories | Feb 22, 2019 | Articles, Diversity | Unverified | 0 | 0 |
| AOG-LSTM: An adaptive attention neural network for visual storytelling | Jun 26, 2023 | Decoder, Visual Storytelling | Unverified | 0 | 0 |
| A Hierarchical Approach for Visual Storytelling Using Image Description | Sep 26, 2019 | Decoder, Image Description | Unverified | 0 | 0 |
| AesopAgent: Agent-driven Evolutionary System on Story-to-Video Production | Mar 12, 2024 | Image Generation, RAG | Unverified | 0 | 0 |
| Reading Between the Lines: Exploring Infilling in Visual Narratives | Oct 26, 2020 | Visual Storytelling | Unverified | 0 | 0 |
| RoViST: Learning Robust Metrics for Visual Storytelling | Dec 17, 2021 | Sentence, Text Generation | Unverified | 0 | 0 |
| Visual Writing Prompts: Character-Grounded Story Generation with Curated Image Sequences | Jan 20, 2023 | Coherence Evaluation, Grounded language learning | Unverified | 0 | 0 |
| Visual Storytelling via Predicting Anchor Word Embeddings in the Stories | Jan 13, 2020 | Visual Storytelling, Word Embeddings | Unverified | 0 | 0 |
| SCO-VIST: Social Interaction Commonsense Knowledge-based Visual Storytelling | Feb 1, 2024 | Diversity, Image Captioning | Unverified | 0 | 0 |
| Semantic Alignment for Multimodal Large Language Models | Aug 23, 2024 | Large Language Model, Visual Storytelling | Unverified | 0 | 0 |
| SentiStory: A Multi-Layered Sentiment-Aware Generative Model for Visual Storytelling | Jun 16, 2022 | Visual Storytelling | Unverified | 0 | 0 |
| Shape2Animal: Creative Animal Generation from Natural Silhouettes | Jun 25, 2025 | Visual Storytelling | Unverified | 0 | 0 |
| Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models | Aug 21, 2024 | Logical Reasoning, Motion Synthesis | Unverified | 0 | 0 |
| Storybooth: Training-free Multi-Subject Consistency for Improved Visual Storytelling | Apr 8, 2025 | Image Generation, Text to Image Generation | Unverified | 0 | 0 |
| Action2Dialogue: Generating Character-Centric Narratives from Scene-Level Prompts | May 22, 2025 | Dialogue Generation, Large Language Model | Unverified | 0 | 0 |
| Storytelling from an Image Stream Using Scene Graphs | Apr 3, 2020 | Story Generation, Visual Storytelling | Unverified | 0 | 0 |
| Storytelling of Photo Stream with Bidirectional Multi-thread Recurrent Neural Network | Jun 2, 2016 | Video Captioning, Visual Storytelling | Unverified | 0 | 0 |
| Stretch-VST: Getting Flexible With Visual Stories | Aug 1, 2021 | Sentence, Visual Storytelling | Unverified | 0 | 0 |
| TARN-VIST: Topic Aware Reinforcement Network for Visual Storytelling | Mar 18, 2024 | Image Captioning, Visual Storytelling | Unverified | 0 | 0 |
| Visual Storytelling with Hierarchical BERT Semantic Guidance | Jan 10, 2022 | Sentence, Text Generation | Unverified | 0 | 0 |
| Text-Only Training for Visual Storytelling | Aug 17, 2023 | Diversity, Informativeness | Unverified | 0 | 0 |
| The Steep Road to Happily Ever After: An Analysis of Current Visual Storytelling Models | Apr 6, 2019 | Survey, Visual Storytelling | Unverified | 0 | 0 |
| Coherent Visual Storytelling via Parallel Top-Down Visual and Topic Attention | Aug 17, 2022 | Diversity, Sentence | Unverified | 0 | 0 |
| Comics for Everyone: Generating Accessible Text Descriptions for Comic Strips | Oct 1, 2023 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Character-Centric Storytelling | Sep 17, 2019 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Commonsense Knowledge Aware Concept Selection For Diverse and Informative Visual Storytelling | Feb 5, 2021 | Diversity, Informativeness | Unverified | 0 | 0 |
| Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling | Aug 11, 2020 | Meta-Learning, Visual Storytelling | Unverified | 0 | 0 |
| Context-aware Visual Storytelling with Visual Prefix Tuning and Contrastive Learning | Aug 12, 2024 | Contrastive Learning, Informativeness | Unverified | 0 | 0 |