Image Generation

Image generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

Unconditional generation refers to sampling from the data distribution with no conditioning input, i.e. $p(y)$, where $y$ is an image.
Conditional image generation (subtask) refers to sampling images conditioned on a label $x$, i.e. $p(y|x)$.
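
The difference between $p(y)$ and $p(y|x)$ can be illustrated with a toy sketch. It is not an image model: each "image" $y$ is a single number, and the class names, means, and weights are made-up values chosen only for illustration.

```python
import random

# Toy stand-in for a labelled image dataset: y is one number, x is a class label.
# These means and mixing weights are hypothetical, not taken from any benchmark.
CLASS_MEANS = {"cat": -2.0, "dog": 3.0}
CLASS_WEIGHTS = {"cat": 0.5, "dog": 0.5}


def sample_conditional(label, rng):
    """Draw y ~ p(y | x = label): the label is given, only y is random."""
    return rng.gauss(CLASS_MEANS[label], 1.0)


def sample_unconditional(rng):
    """Draw y ~ p(y) = sum_x p(x) p(y | x): a label is drawn internally, then discarded."""
    labels = list(CLASS_WEIGHTS)
    label = rng.choices(labels, weights=[CLASS_WEIGHTS[l] for l in labels])[0]
    return sample_conditional(label, rng)


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)
uncond = [sample_unconditional(rng) for _ in range(5000)]
cond_dog = [sample_conditional("dog", rng) for _ in range(5000)]

print(f"unconditional sample mean: {mean(uncond):.2f}")
print(f"conditional (dog) sample mean: {mean(cond_dog):.2f}")
```

Unconditional samples cover both classes, so their mean falls between the two class means, while conditional samples concentrate around the mean of the requested class.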

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers


Showing 1676–1700 of 6689 papers

Title	Date	Tasks	Status	Hype
Using Physics Informed Generative Adversarial Networks to Model 3D porous media	Sep 17, 2024	Image Generation	Unverified	0
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think	Sep 17, 2024	Conditional Image Generation, Depth Estimation	Code Available	4
MM2Latent: Text-to-facial image generation and editing in GANs with multimodal assistance	Sep 17, 2024	Face Generation, Image Generation	Code Available	1
OmniGen: Unified Image Generation	Sep 17, 2024	Edge Detection, Image Generation	Code Available	7
Improving the Efficiency of Visually Augmented Language Models	Sep 17, 2024	Image Generation, Image Retrieval	Code Available	0
2S-ODIS: Two-Stage Omni-Directional Image Synthesis by Geometric Distortion Correction	Sep 16, 2024	distortion correction, ERP	Code Available	0
On Synthetic Texture Datasets: Challenges, Creation, and Curation	Sep 16, 2024	Image Generation, Texture Synthesis	Unverified	0
Robust image representations with counterfactual contrastive learning	Sep 16, 2024	Contrastive Learning, counterfactual	Code Available	1
SimInversion: A Simple Framework for Inversion-Based Text-to-Image Editing	Sep 16, 2024	Image Generation	Unverified	0
VAE-QWGAN: Addressing Mode Collapse in Quantum GANs via Autoencoding Priors	Sep 16, 2024	Decoder, Diversity	Unverified	0
Cross-modality image synthesis from TOF-MRA to CTA using diffusion-based models	Sep 16, 2024	Diagnostic, Image Generation	Unverified	0
MotionCom: Automatic and Motion-Aware Image Composition with LLM and Video Diffusion Prior	Sep 16, 2024	Image Generation, Language Modeling	Code Available	0
One-Shot Learning for Pose-Guided Person Image Synthesis in the Wild	Sep 15, 2024	GPU, Image Generation	Code Available	1
E-Commerce Inpainting with Mask Guidance in Controlnet for Reducing Overcompletion	Sep 15, 2024	Image Generation	Unverified	0
Generalizing Alignment Paradigm of Text-to-Image Generation with Preferences through f-divergence Minimization	Sep 15, 2024	Diversity, Image Generation	Unverified	0
GRIN: Zero-Shot Metric Depth with Pixel-Level Diffusion	Sep 15, 2024	3D Reconstruction, Depth Estimation	Unverified	0
Finetuning CLIP to Reason about Pairwise Differences	Sep 15, 2024	Attribute, Contrastive Learning	Code Available	1
Beta-Sigma VAE: Separating beta and decoder variance in Gaussian variational autoencoder	Sep 14, 2024	Decoder, Image Generation	Code Available	0
Enhancing Privacy in ControlNet and Stable Diffusion via Split Learning	Sep 13, 2024	Federated Learning, Image Generation	Unverified	0
GroundingBooth: Grounding Text-to-Image Customization	Sep 13, 2024	Image Generation	Unverified	0
InstantDrag: Improving Interactivity in Drag-based Image Editing	Sep 13, 2024	Image Generation, Motion Generation	Unverified	0
High-Frequency Anti-DreamBooth: Robust Defense against Personalized Image Synthesis	Sep 12, 2024	Adversarial Attack, Adversarial Purification	Code Available	0
Scribble-Guided Diffusion for Training-free Text-to-Image Generation	Sep 12, 2024	Image Generation, Text to Image Generation	Code Available	1
Click2Mask: Local Editing with Dynamic Mask Generation	Sep 12, 2024	Image Generation, Image Manipulation	Code Available	1
Improving Virtual Try-On with Garment-focused Diffusion Models	Sep 12, 2024	Image Generation, Virtual Try-on	Code Available	1


Datasets: ImageNet 256x256, CIFAR-10, ImageNet 64x64, ImageNet 512x512, FFHQ 256 x 256, CelebA 64x64, ImageNet 32x32, LSUN Bedroom 256 x 256, STL-10, LSUN Churches 256 x 256, ImageNet 128x128, FFHQ 1024 x 1024

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Improved DDPM	FID	12.3	—	Unverified
2	ADM	FID	11.84	—	Unverified
3	BigGAN-deep	FID	8.1	—	Unverified
4	Polarity-BigGAN	FID	6.82	—	Unverified
5	VQGAN+Transformer (k=mixed, p=1.0, a=0.005)	FID	6.59	—	Unverified
6	MaskGIT	FID	6.18	—	Unverified
7	VQGAN+Transformer (k=600, p=1.0, a=0.05)	FID	5.2	—	Unverified
8	CDM	FID	4.88	—	Unverified
9	ADM-G	FID	4.59	—	Unverified
10	RIN	FID	4.51	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	PresGAN	FID	52.2	—	Unverified
2	RESFLOW	FID	48.29	—	Unverified
3	Residual Flow	FID	46.37	—	Unverified
4	GLF+perceptual loss (ours)	FID	44.6	—	Unverified
5	ProdPoly no activation functions	FID	40.45	—	Unverified
6	ProdPoly no activation functions	FID	36.77	—	Unverified
7	ACGAN	FID	35.47	—	Unverified
8	DenseFlow-74-10	FID	34.9	—	Unverified
9	NVAE w/ flow	FID	32.53	—	Unverified
10	QSNGAN	FID	31.97	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	GLIDE + CLS	FID	30.87	—	Unverified
2	GLIDE + CLIP	FID	30.46	—	Unverified
3	GLIDE + CLS-FREE	FID	29.22	—	Unverified
4	GLIDE + CLIP + CLS + CLS-FREE	FID	29.18	—	Unverified
5	PGMGAN	FID	21.73	—	Unverified
6	CLR-GAN	FID	20.27	—	Unverified
7	FM	FID	14.45	—	Unverified
8	CT (Direct Generation, NFE=1)	FID	13	—	Unverified
9	CT (Direct Generation, NFE=2)	FID	11.1	—	Unverified
10	GLIDE + CLS	KID	7.95	—	Unverified
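
Most entries above are scored with FID (Fréchet Inception Distance), where lower is better. FID fits one Gaussian to features of real images and one to features of generated images, then computes the squared Fréchet distance $\|\mu_r - \mu_g\|^2 + \mathrm{Tr}\left(\Sigma_r + \Sigma_g - 2(\Sigma_r \Sigma_g)^{1/2}\right)$. The sketch below is a simplified illustration under two loud assumptions: the covariances are treated as diagonal, and the feature vectors are hand-made placeholders rather than Inception-v3 activations. All function names are hypothetical, and the result is not comparable to the leaderboard numbers.

```python
import math
from statistics import fmean, pvariance


def diagonal_gaussian_stats(features):
    """Per-dimension mean and (population) variance of equal-length feature vectors."""
    columns = list(zip(*features))
    return [fmean(c) for c in columns], [pvariance(c) for c in columns]


def frechet_distance_diagonal(mu_r, var_r, mu_g, var_g):
    """Squared Frechet distance between N(mu_r, diag(var_r)) and N(mu_g, diag(var_g)).

    Assumption: diagonal covariances. In that case the trace term
    Tr(S_r + S_g - 2 (S_r S_g)^(1/2)) reduces to
    sum_i (sqrt(var_r[i]) - sqrt(var_g[i]))^2.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_g))
    cov_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var_r, var_g))
    return mean_term + cov_term


# Placeholder 2-D "features"; a real FID pipeline would extract Inception-v3 activations.
real = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
fake = [[1.0, 1.0], [3.0, 3.0], [5.0, 5.0]]

score_same = frechet_distance_diagonal(
    *diagonal_gaussian_stats(real), *diagonal_gaussian_stats(real)
)
score_diff = frechet_distance_diagonal(
    *diagonal_gaussian_stats(real), *diagonal_gaussian_stats(fake)
)
print(score_same, score_diff)
```

Comparing a feature set with itself gives zero, and shifting the mean of one feature dimension raises the score. A full implementation would use the complete covariance matrices and a matrix square root instead of the diagonal shortcut.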