Image Generation

Image generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

Unconditional generation refers to sampling from the dataset distribution without any conditioning input, i.e. $p(y)$.
Conditional image generation (a subtask) refers to sampling conditioned on a label $x$, i.e. $p(y|x)$.
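
The difference between $p(y)$ and $p(y|x)$ can be illustrated with a toy sketch. It is not a generative model: it only resamples from a small hypothetical dataset (the names `DATASET`, `sample_unconditional`, and `sample_conditional` are invented for illustration), whereas a real model learns to approximate these distributions and produces new images.

```python
import random

# Hypothetical toy "dataset" of (label x, image y) pairs.
# Strings stand in for images; none of this comes from a real benchmark.
DATASET = [
    ("cat", "cat_img_0"),
    ("cat", "cat_img_1"),
    ("dog", "dog_img_0"),
    ("dog", "dog_img_1"),
    ("dog", "dog_img_2"),
]


def sample_unconditional(rng):
    """Draw y ~ p(y): pick any image, ignoring the labels."""
    _, image = rng.choice(DATASET)
    return image


def sample_conditional(rng, label):
    """Draw y ~ p(y | x): pick only among images whose label x matches."""
    candidates = [image for x, image in DATASET if x == label]
    if not candidates:
        raise ValueError(f"no images with label {label!r}")
    return rng.choice(candidates)


rng = random.Random(0)
print(sample_unconditional(rng))       # may be a cat or a dog image
print(sample_conditional(rng, "cat"))  # always a cat image
```

In this sketch the unconditional sampler can return any image, while the conditional sampler is restricted to the subset that matches the requested label.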

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 2051–2100 of 6689 papers

Title	Date	Tasks	Status	Hype
What's Next? Exploring Utilization, Challenges, and Future Directions of AI-Generated Image Tools in Graphic Design	Jun 19, 2024	Image Generation	Unverified	0
DF40: Toward Next-Generation Deepfake Detection	Jun 19, 2024	DeepFake Detection, Face Reenactment	Code Available	3
Improving Visual Commonsense in Language Models via Multiple Image Generation	Jun 19, 2024	Common Sense Reasoning, Image Generation	Code Available	1
Training Diffusion Models with Federated Learning	Jun 18, 2024	Denoising, Federated Learning	Unverified	0
Cyclic 2.5D Perceptual Loss for Cross-Modal 3D Medical Image Synthesis: T1w MRI to Tau PET	Jun 18, 2024	Image Generation, SSIM	Code Available	0
AITTI: Learning Adaptive Inclusive Token for Text-to-Image Generation	Jun 18, 2024	Attribute, Fairness	Code Available	1
ARTIST: Improving the Generation of Text-rich Images with Disentangled Diffusion Models and Large Language Models	Jun 17, 2024	Disentanglement, Image Generation	Unverified	0
Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models	Jun 17, 2024	All, Contrastive Learning	Code Available	1
Decomposed evaluations of geographic disparities in text-to-image models	Jun 17, 2024	Attribute, Diversity	Unverified	0
GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation	Jun 17, 2024	Image Generation, Math	Code Available	0
PhyBench: A Physical Commonsense Benchmark for Evaluating Text-to-Image Models	Jun 17, 2024	Image Generation	Unverified	0
Discriminative Hamiltonian Variational Autoencoder for Accurate Tumor Segmentation in Data-Scarce Regimes	Jun 17, 2024	Data Augmentation, Image Generation	Unverified	0
Generative Visual Instruction Tuning	Jun 17, 2024	Image Generation, Image-text matching	Code Available	0
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%	Jun 17, 2024	image-classification, Image Classification	Code Available	2
Autoregressive Image Generation without Vector Quantization	Jun 17, 2024	Image Generation, Quantization	Code Available	5
Latent Denoising Diffusion GAN: Faster sampling, Higher image quality	Jun 17, 2024	Denoising, Diversity	Code Available	1
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models	Jun 17, 2024	Decoder, Image Generation	Unverified	0
Mixture-of-Subspaces in Low-Rank Adaptation	Jun 16, 2024	Common Sense Reasoning, Image Generation	Code Available	0
STAR: Scale-wise Text-to-image generation via Auto-Regressive representations	Jun 16, 2024	Diversity, Image Generation	Code Available	2
An Analysis on Quantizing Diffusion Transformers	Jun 16, 2024	Conditional Image Generation, Denoising	Unverified	0
Can Generative AI Replace Immunofluorescent Staining Processes? A Comparison Study of Synthetically Generated CellPainting Images from Brightfield	Jun 15, 2024	Image Generation	Unverified	0
Poetry2Image: An Iterative Correction Framework for Images Generated from Chinese Classical Poetry	Jun 15, 2024	Image Generation, Text to Image Generation	Unverified	0
MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation	Jun 15, 2024	AudioCaps, Image Generation	Code Available	0
Make It Count: Text-to-Image Generation with an Accurate Number of Objects	Jun 14, 2024	Denoising, Image Generation	Code Available	2
Crafting Parts for Expressive Object Composition	Jun 14, 2024	Denoising, Image Generation	Unverified	0
ControlVAR: Exploring Controllable Visual Autoregressive Modeling	Jun 14, 2024	Image Generation	Code Available	2
Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation	Jun 13, 2024	GPU, Image Generation	Unverified	0
Understanding Hallucinations in Diffusion Models through Mode Interpolation	Jun 13, 2024	Hallucination, Image Generation	Code Available	2
StableMaterials: Enhancing Diversity in Material Generation via Semi-Supervised Learning	Jun 13, 2024	Diversity, Image Generation	Unverified	0
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels	Jun 13, 2024	Image Generation, Inductive Bias	Unverified	0
Batch-Instructed Gradient for Prompt Evolution:Systematic Prompt Optimization for Enhanced Text-to-Image Synthesis	Jun 13, 2024	Image Generation, Text to Image Generation	Code Available	0
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models and Time-Dependent Layer Normalization	Jun 13, 2024	Image Generation	Code Available	1
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts	Jun 13, 2024	Conditional Image Generation, Image Generation	Code Available	5
TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation	Jun 12, 2024	Benchmarking, Image Generation	Code Available	1
WMAdapter: Adding WaterMark Control to Latent Diffusion Models	Jun 12, 2024	Image Generation, Transfer Learning	Unverified	0
Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation	Jun 12, 2024	Image Generation, Perceptual Distance	Unverified	0
FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation	Jun 12, 2024	Image Generation, Text to Image Generation	Unverified	0
DiTFastAttn: Attention Compression for Diffusion Transformer Models	Jun 12, 2024	2k, Image Generation	Unverified	0
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks	Jun 12, 2024	Image Generation, Language Modeling	Code Available	5
What If We Recaption Billions of Web Images with LLaMA-3?	Jun 12, 2024	Cross-Modal Retrieval, Image Generation	Unverified	0
Understanding and Mitigating Compositional Issues in Text-to-Image Generative Models	Jun 12, 2024	Image Generation	Code Available	0
CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models	Jun 12, 2024	Image Generation, text-guided-generation	Code Available	1
Diffusion Soup: Model Merging for Text-to-Image Diffusion Models	Jun 12, 2024	Continual Learning, Image Generation	Unverified	0
Progress Towards Decoding Visual Imagery via fNIRS	Jun 11, 2024	Image Generation, Image Reconstruction	Unverified	0
Image and Video Tokenization with Binary Spherical Quantization	Jun 11, 2024	Decoder, Image Generation	Code Available	3
An Image is Worth 32 Tokens for Reconstruction and Generation	Jun 11, 2024	Image Generation, Image Reconstruction	Code Available	3
Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions	Jun 11, 2024	Hallucination, Image Description	Code Available	2
SPIN: Spacecraft Imagery for Navigation	Jun 11, 2024	Data Augmentation, Image Generation	Code Available	1
Beware of Aliases -- Signal Preservation is Crucial for Robust Image Restoration	Jun 11, 2024	Decoder, Image Generation	Unverified	0
Understanding Visual Concepts Across Models	Jun 11, 2024	Image Generation, object-detection	Code Available	0

Datasets: ImageNet 256x256, CIFAR-10, ImageNet 64x64, ImageNet 512x512, FFHQ 256 x 256, CelebA 64x64, ImageNet 32x32, LSUN Bedroom 256 x 256, STL-10, LSUN Churches 256 x 256, ImageNet 128x128, FFHQ 1024 x 1024

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Improved DDPM	FID	12.3	—	Unverified
2	ADM	FID	11.84	—	Unverified
3	BigGAN-deep	FID	8.1	—	Unverified
4	Polarity-BigGAN	FID	6.82	—	Unverified
5	VQGAN+Transformer (k=mixed, p=1.0, a=0.005)	FID	6.59	—	Unverified
6	MaskGIT	FID	6.18	—	Unverified
7	VQGAN+Transformer (k=600, p=1.0, a=0.05)	FID	5.2	—	Unverified
8	CDM	FID	4.88	—	Unverified
9	ADM-G	FID	4.59	—	Unverified
10	RIN	FID	4.51	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	PresGAN	FID	52.2	—	Unverified
2	RESFLOW	FID	48.29	—	Unverified
3	Residual Flow	FID	46.37	—	Unverified
4	GLF+perceptual loss (ours)	FID	44.6	—	Unverified
5	ProdPoly no activation functions	FID	40.45	—	Unverified
6	ProdPoly no activation functions	FID	36.77	—	Unverified
7	ACGAN	FID	35.47	—	Unverified
8	DenseFlow-74-10	FID	34.9	—	Unverified
9	NVAE w/ flow	FID	32.53	—	Unverified
10	QSNGAN	FID	31.97	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	GLIDE + CLS	FID	30.87	—	Unverified
2	GLIDE + CLIP	FID	30.46	—	Unverified
3	GLIDE + CLS-FREE	FID	29.22	—	Unverified
4	GLIDE + CLIP + CLS + CLS-FREE	FID	29.18	—	Unverified
5	PGMGAN	FID	21.73	—	Unverified
6	CLR-GAN	FID	20.27	—	Unverified
7	FM	FID	14.45	—	Unverified
8	CT (Direct Generation, NFE=1)	FID	13	—	Unverified
9	CT (Direct Generation, NFE=2)	FID	11.1	—	Unverified
10	GLIDE + CLS	KID	7.95	—	Unverified
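
Most rows in the tables above report FID (Fréchet Inception Distance), where lower values indicate generated images that are statistically closer to real ones. FID fits a Gaussian to feature vectors of real and generated images and computes $\|\mu_r - \mu_g\|^2 + \mathrm{Tr}\big(\Sigma_r + \Sigma_g - 2(\Sigma_r \Sigma_g)^{1/2}\big)$. The sketch below covers only the one-dimensional special case, where the expression reduces to $(\mu_r - \mu_g)^2 + (\sigma_r - \sigma_g)^2$. It assumes scalar features supplied as plain lists; real FID uses high-dimensional features from an Inception network, and the function name `frechet_distance_1d` is hypothetical.

```python
import statistics


def frechet_distance_1d(real, generated):
    """Squared Fréchet distance between 1-D Gaussians fitted to two samples.

    Simplified illustration only: real FID operates on multi-dimensional
    Inception features with full covariance matrices.
    """
    mu_r, mu_g = statistics.fmean(real), statistics.fmean(generated)
    sd_r, sd_g = statistics.pstdev(real), statistics.pstdev(generated)
    return (mu_r - mu_g) ** 2 + (sd_r - sd_g) ** 2


# Identical samples give a distance of zero; shifting the generated
# features away from the real ones increases the distance.
real_features = [0.0, 1.0, 2.0]
print(frechet_distance_1d(real_features, [0.0, 1.0, 2.0]))
print(frechet_distance_1d(real_features, [1.0, 2.0, 3.0]))
```

Because the score depends on the feature extractor, sample count, and dataset, FID values are generally comparable only within the same benchmark table.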