Story Continuation
In story continuation, the model is given an initial scene, which is readily available in real-world use cases, and generates the images that follow it. Conditioning on this scene lets the model copy and adapt elements from it as it generates subsequent images.
Source: StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation
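The setup described above can be sketched as a small interface. This is only an illustration under stated assumptions: the names `continue_story` and `generate` are hypothetical, per-frame captions are assumed as the text input, and whether a model also looks at its own earlier outputs is model-specific and not modeled here.

```python
from typing import Any, Callable, List, Sequence


def continue_story(
    source_image: Any,
    captions: Sequence[str],
    generate: Callable[[Any, str], Any],
) -> List[Any]:
    """Generate one image per caption, each conditioned on the initial scene.

    `generate` is a hypothetical stand-in for any text-to-image model that
    accepts a source image plus a caption; it is not a real library call.
    """
    frames = []
    for caption in captions:
        # Every frame sees the same initial scene, so the generator can
        # copy and adapt elements (characters, background) from it.
        frames.append(generate(source_image, caption))
    return frames


# Toy usage with a dummy generator that only records its inputs.
def dummy_generate(source: Any, caption: str) -> str:
    return f"image({source!r}, {caption!r})"


story = continue_story(
    "scene_0",
    ["The cat wakes up.", "The cat goes outside."],
    dummy_generate,
)
```

The dummy generator only shows the data flow: one source image in, one generated frame per caption out.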
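Benchmark Results
FID (Fréchet Inception Distance): lower is better, so the rows below run from highest to lowest FID.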
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | StoryDALL-E (Story Embeddings + Cross-Attention) | FID | 36.28 | — | Unverified |
| 2 | StoryDALL-E (Cross-Attention) | FID | 35.04 | — | Unverified |
| 3 | StoryDALL-E (Story Embeddings) | FID | 29.21 | — | Unverified |
| 4 | StoryDALL-E | FID | 28.37 | — | Unverified |
| 5 | AR-LDM | FID | 19.28 | — | Unverified |
| 6 | ContextualStory | FID | 16.33 | — | Unverified |
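Because lower FID indicates generated images closer to the reference distribution, it can help to re-rank the claimed scores in ascending order. The snippet below is a minimal sketch that does this with the values copied from the table; the scores are the claimed, unverified figures, and the variable names are illustrative only.

```python
# Claimed (unverified) FID scores copied from the table above.
results = [
    ("StoryDALL-E (Story Embeddings + Cross-Attention)", 36.28),
    ("StoryDALL-E (Cross-Attention)", 35.04),
    ("StoryDALL-E (Story Embeddings)", 29.21),
    ("StoryDALL-E", 28.37),
    ("AR-LDM", 19.28),
    ("ContextualStory", 16.33),
]

# FID is a distance, so sort ascending: the best (lowest) score comes first.
ranked = sorted(results, key=lambda row: row[1])

for rank, (model, fid) in enumerate(ranked, start=1):
    print(f"{rank}. {model}: FID {fid:.2f}")
```

The table does not say which dataset each score was reported on, so this ordering is only meaningful if the scores are in fact comparable.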