| Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models | Jan 1, 2024 | Image Generation, Text to Image Generation | Code Available | 3 | 5 |
| FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations | Nov 16, 2024 | Visual Storytelling | Code Available | 3 | 5 |
| Alfie: Democratising RGBA Image Generation With No $ | Aug 27, 2024 | Image Generation, Image Matting | Code Available | 2 | 5 |
| CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation | Jun 15, 2024 | In-Context Learning, Text Generation | Code Available | 2 | 5 |
| Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models | Jun 1, 2023 | Image Generation, Story Visualization | Code Available | 2 | 5 |
| Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation | Jul 13, 2023 | Retrieval, Video Generation | Code Available | 2 | 5 |
| Gorgeous: Create Your Desired Character Facial Makeup from Any Ideas | Apr 22, 2024 | Visual Storytelling | Code Available | 1 | 5 |
| inkn'hue: Enhancing Manga Colorization from Multiple Priors with Alignment Multi-Encoder VAE | Nov 3, 2023 | Colorization, Visual Storytelling | Code Available | 1 | 5 |
| StoryReasoning Dataset: Using Chain-of-Thought for Scene Understanding and Grounded Story Generation | May 15, 2025 | Face Recognition, Object | Code Available | 1 | 5 |
| Positional Diffusion: Ordering Unordered Sets with Diffusion Probabilistic Models | Mar 20, 2023 | Graph Neural Network, Sentence | Code Available | 1 | 5 |
| TouchStone: Evaluating Vision-Language Models by Language Models | Aug 31, 2023 | Visual Storytelling | Code Available | 1 | 5 |
| Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling | Aug 7, 2024 | Image Generation, Language Modelling | Code Available | 1 | 5 |
| Expressive Scene Graph Generation Using Commonsense Knowledge Infusion for Visual Understanding and Reasoning | May 31, 2022 | Common Sense Reasoning, Graph Generation | Code Available | 1 | 5 |
| Plot and Rework: Modeling Storylines for Visual Storytelling | May 14, 2021 | Diversity, Form | Code Available | 1 | 5 |
| Multimodal Large Language Models and Tunings: Vision, Language, Sensors, Audio, and Beyond | Oct 8, 2024 | Question Answering, Visual Question Answering | Code Available | 0 | 5 |
| No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling | Apr 24, 2018 | Image Captioning, Reinforcement Learning | Code Available | 0 | 5 |
| Knowledge-Enriched Visual Storytelling | Dec 3, 2019 | Knowledge Graphs, Story Generation | Code Available | 0 | 5 |
| Knowledgeable Storyteller: A Commonsense-Driven Generative Model for Visual Storytelling | May 4, 2019 | AI Agent, Knowledge Graphs | Code Available | 0 | 5 |
| Learning to Rank Visual Stories From Human Ranking Data | May 1, 2022 | Learning-To-Rank, Text Generation | Code Available | 0 | 5 |
| Not (yet) the whole story: Evaluating Visual Storytelling Requires More than Measuring Coherence, Grounding, and Repetition | Jul 5, 2024 | Visual Grounding, Visual Storytelling | Code Available | 0 | 5 |
| Contextualize, Show and Tell: A Neural Visual Storyteller | Jun 3, 2018 | Decoder, Image Description | Code Available | 0 | 5 |
| GROOViST: A Metric for Grounding Objects in Visual Storytelling | Oct 26, 2023 | Visual Grounding, Visual Storytelling | Code Available | 0 | 5 |
| Informative Visual Storytelling with Cross-modal Rules | Jul 7, 2019 | Decoder, Story Generation | Code Available | 0 | 5 |
| Detecting and Grounding Important Characters in Visual Stories | Mar 30, 2023 | Visual Storytelling | Code Available | 0 | 5 |
| Consistent Story Generation with Asymmetry Zigzag Sampling | Jun 11, 2025 | Image Generation, Story Generation | Code Available | 0 | 5 |