SOTAVerified

Personalized Image Generation

Uses one or more images of the same subject or style, together with a text prompt, to generate images that contain that subject and match the textual description. Includes finetuning-based methods (e.g. DreamBooth, Textual Inversion) as well as encoder-based methods (e.g. E4T, ELITE, IP-Adapter).

Papers

Showing 26-50 of 58 papers

| Title | Status | Hype |
|---|---|---|
| EZIGen: Enhancing zero-shot personalized image generation with precise subject encoding and decoupled guidance | Code | 2 |
| TextBoost: Towards One-Shot Personalization of Text-to-Image Models via Fine-tuning Text Encoder | Code | 2 |
| ViPer: Visual Personalization of Generative Models via Individual Preference Learning | | 0 |
| Layout-and-Retouch: A Dual-stage Framework for Improving Diversity in Personalized Image Generation | | 0 |
| DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation | Code | 2 |
| FreeTuner: Any Subject in Any Style with Training-free Diffusion | | 0 |
| RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance | Code | 2 |
| InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation | | 0 |
| MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation | | 0 |
| CAT: Contrastive Adapter Training for Personalized Image Generation | Code | 0 |
| MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation | Code | 3 |
| MM-Diff: High-Fidelity Image Personalization via Multi-Modal Condition Integration | | 0 |
| IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models | | 0 |
| Fast Personalized Text-to-Image Syntheses With Attention Injection | | 0 |
| Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition | Code | 2 |
| Beyond Inserting: Learning Identity Embedding for Semantic-Fidelity Personalized Diffusion Generation | | 0 |
| BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models | Code | 1 |
| When StyleGAN Meets Stable Diffusion: a W+ Adapter for Personalized Image Generation | Code | 2 |
| Generative Multimodal Models are In-Context Learners | Code | 3 |
| PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization | | 0 |
| HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models | | 0 |
| When StyleGAN Meets Stable Diffusion: a W_+ Adapter for Personalized Image Generation | Code | 1 |
| CatVersion: Concatenating Embeddings for Diffusion-Based Text-to-Image Personalization | | 0 |
| FaceChain: A Playground for Human-centric Artificial Intelligence Generated Content | Code | 0 |
| IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models | Code | 5 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DreamBooth LoRA SDXL v1.0 | Overall (CP * PF) | 0.52 | | Unverified |
| 2 | IP-Adapter ViT-G SDXL v1.0 | Overall (CP * PF) | 0.38 | | Unverified |
| 3 | Emu2 SDXL v1.0 | Overall (CP * PF) | 0.36 | | Unverified |
| 4 | DreamBooth SD v1.5 | Overall (CP * PF) | 0.36 | | Unverified |
| 5 | IP-Adapter-Plus ViT-H SDXL v1.0 | Overall (CP * PF) | 0.34 | | Unverified |
| 6 | BLIP-Diffusion SD v1.5 | Overall (CP * PF) | 0.27 | | Unverified |
| 7 | Textual Inversion SD v1.5 | Overall (CP * PF) | 0.24 | | Unverified |
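
The "Overall (CP * PF)" metric is a product of two per-model scores. The page does not expand the abbreviations; the sketch below assumes CP is a concept preservation score and PF a prompt following score, both in [0, 1], as in DreamBench++. The `Entry` class, the `rank` helper, and all numeric values are hypothetical placeholders for illustration and are not taken from the leaderboard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    model: str
    cp: float  # assumed meaning: concept preservation score in [0, 1]
    pf: float  # assumed meaning: prompt following score in [0, 1]

    @property
    def overall(self) -> float:
        # Overall = CP * PF: a model must do well on both to score high.
        return self.cp * self.pf


def rank(entries):
    """Sort entries by the product metric, best first."""
    return sorted(entries, key=lambda e: e.overall, reverse=True)


# Hypothetical placeholder scores, NOT values from the table above.
entries = [
    Entry("method_a", cp=0.8, pf=0.5),
    Entry("method_b", cp=0.6, pf=0.9),
    Entry("method_c", cp=0.9, pf=0.3),
]

ranked = rank(entries)
for position, entry in enumerate(ranked, start=1):
    print(f"{position} {entry.model} overall={entry.overall:.2f}")
```

Because the scores are multiplied rather than averaged, a method that copies the reference subject almost perfectly but ignores the prompt (high CP, low PF) can rank below a more balanced one.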