SOTAVerified

Personalized Image Generation

Uses one or more images containing the same subject or style, together with a text prompt, to generate images that preserve that subject while also matching the textual description. Includes finetuning-based methods (e.g. DreamBooth, Textual Inversion) as well as encoder-based methods (e.g. E4T, ELITE, and IP-Adapter).

Papers

Showing 1–10 of 58 papers

| Title | Status | Hype |
|---|---|---|
| An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion | Code | 5 |
| Less-to-More Generalization: Unlocking More Controllability by In-Context Generation | Code | 5 |
| DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | Code | 5 |
| IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models | Code | 5 |
| StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation | Code | 4 |
| Generative Multimodal Models are In-Context Learners | Code | 3 |
| Personalized Image Generation with Deep Generative Models: A Decade Survey | Code | 3 |
| MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation | Code | 3 |
| Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition | Code | 2 |
| FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention | Code | 2 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DreamBooth LoRA SDXL v1.0 | Overall (CP * PF) | 0.52 | — | Unverified |
| 2 | IP-Adapter ViT-G SDXL v1.0 | Overall (CP * PF) | 0.38 | — | Unverified |
| 3 | Emu2 SDXL v1.0 | Overall (CP * PF) | 0.36 | — | Unverified |
| 4 | DreamBooth SD v1.5 | Overall (CP * PF) | 0.36 | — | Unverified |
| 5 | IP-Adapter-Plus ViT-H SDXL v1.0 | Overall (CP * PF) | 0.34 | — | Unverified |
| 6 | BLIP-Diffusion SD v1.5 | Overall (CP * PF) | 0.27 | — | Unverified |
| 7 | Textual Inversion SD v1.5 | Overall (CP * PF) | 0.24 | — | Unverified |
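The Metric column states that the Overall score is the product of two sub-scores, CP and PF (the table does not expand these abbreviations; they are commonly concept-preservation and prompt-following style scores, but that reading is an assumption here). A minimal sketch of how such a composite score would be computed, assuming both sub-scores lie in [0, 1]:

```python
def overall(cp: float, pf: float) -> float:
    """Composite leaderboard score: Overall = CP * PF.

    Assumes cp and pf are each normalized to [0, 1], so the product
    penalizes models that trade one axis entirely for the other.
    """
    if not (0.0 <= cp <= 1.0 and 0.0 <= pf <= 1.0):
        raise ValueError("CP and PF are expected to lie in [0, 1]")
    return cp * pf


# Hypothetical sub-scores (not taken from the table) that would
# multiply to a 0.52 overall score:
print(round(overall(0.80, 0.65), 2))
```

The multiplicative form means a model scoring 1.0 on one axis and 0.0 on the other gets an Overall of 0.0, unlike an average, which would still award 0.5.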