SOTAVerified

Image Generation

Image generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

  • Unconditional generation refers to sampling images without any conditioning input, i.e. from $p(y)$.
  • Conditional image generation (subtask) refers to sampling images conditioned on a label $x$, i.e. from $p(y|x)$.
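
The difference between the two settings can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the labels, the prior `LABEL_PRIOR`, and the per-label 1-D Gaussians in `CLASS_PARAMS` stand in for a real image model, and a single float stands in for an image $y$. Conditional sampling draws from $p(y|x)$ for a chosen label, while unconditional sampling draws from the marginal $p(y) = \sum_x p(x)\,p(y|x)$.

```python
import random

# Hypothetical toy "dataset": each label x has a prior probability and a
# 1-D Gaussian standing in for the distribution of images y with that label.
LABEL_PRIOR = {"cat": 0.5, "dog": 0.3, "car": 0.2}
CLASS_PARAMS = {"cat": (-2.0, 0.5), "dog": (0.0, 0.5), "car": (3.0, 0.5)}  # (mean, std)


def sample_conditional(label, rng):
    """Draw y ~ p(y | x = label)."""
    mean, std = CLASS_PARAMS[label]
    return rng.gauss(mean, std)


def sample_unconditional(rng):
    """Draw y ~ p(y) by drawing a label from the prior, then discarding it."""
    labels = list(LABEL_PRIOR)
    weights = [LABEL_PRIOR[name] for name in labels]
    label = rng.choices(labels, weights=weights, k=1)[0]
    return sample_conditional(label, rng)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
conditional = [sample_conditional("car", rng) for _ in range(2000)]
unconditional = [sample_unconditional(rng) for _ in range(2000)]

mean_conditional = sum(conditional) / len(conditional)
mean_unconditional = sum(unconditional) / len(unconditional)
print(f"mean of y given x=car:     {mean_conditional:.2f}")
print(f"mean of y (unconditional): {mean_unconditional:.2f}")
```

In this toy setup the conditional samples cluster around the mean chosen for "car", whereas the unconditional samples are spread across all three modes in proportion to the prior.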

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 2051 to 2100 of 6689 papers

Title | Status | Hype
Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model | Code | 1
P+: Extended Textual Conditioning in Text-to-Image Generation | Code | 1
Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering | Code | 1
Personalized Image Generation with Large Multimodal Models | Code | 1
ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections | Code | 1
Making Convolutional Networks Shift-Invariant Again | Code | 1
EVALALIGN: Supervised Fine-Tuning Multimodal LLMs with Human-Aligned Data for Evaluating Text-to-Image Models | Code | 1
Personalized Text-to-Image Generation with Auto-Regressive Models | Code | 1
PD-GAN: Probabilistic Diverse GAN for Image Inpainting | Code | 1
ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation | Code | 1
PATMAT: Person Aware Tuning of Mask-Aware Transformer for Face Inpainting | Code | 1
PEA-Diffusion: Parameter-Efficient Adapter with Knowledge Distillation in non-English Text-to-Image Generation | Code | 1
ESVAE: An Efficient Spiking Variational Autoencoder with Reparameterizable Poisson Spiking Sampling | Code | 1
Erasing Undesirable Influence in Diffusion Models | Code | 1
Evaluating the feasibility of using Generative Models to generate Chest X-Ray Data | Code | 1
Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization | Code | 1
SofGAN: A Portrait Image Generator with Dynamic Styling | Code | 1
Equivariant Image Modeling | Code | 1
DiffBoost: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model | Code | 1
PathLDM: Text conditioned Latent Diffusion Model for Histopathology | Code | 1
PEPSI: Pathology-Enhanced Pulse-Sequence-Invariant Representations for Brain MRI | Code | 1
Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing | Code | 1
Personalized Visual Instruction Tuning | Code | 1
Elucidating the solution space of extended reverse-time SDE for diffusion models | Code | 1
Elucidating the Exposure Bias in Diffusion Models | Code | 1
Enlisting 3D Crop Models and GANs for More Data Efficient and Generalizable Fruit Detection | Code | 1
Elucidating the design space of language models for image generation | Code | 1
Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models | Code | 1
Parts of Speech-Grounded Subspaces in Vision-Language Models | Code | 1
Elucidating The Design Space of Classifier-Guided Diffusion Generation | Code | 1
Particle Guidance: non-I.I.D. Diverse Sampling with Diffusion Models | Code | 1
Enhancing Scene Text Detectors with Realistic Text Image Synthesis Using Diffusion Models | Code | 1
End-to-End Diffusion Latent Optimization Improves Classifier Guidance | Code | 1
Ensembling Off-the-shelf Models for GAN Training | Code | 1
Partition-Guided GANs | Code | 1
PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation | Code | 1
PanoLlama: Generating Endless and Coherent Panoramas with Next-Token-Prediction LLMs | Code | 1
Paragraph-to-Image Generation with Information-Enriched Diffusion Model | Code | 1
ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation | Code | 1
Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization | Code | 1
Entropy-driven Sampling and Training Scheme for Conditional Diffusion Generation | Code | 1
Discovering and Mitigating Visual Biases through Keyword Explanation | Code | 1
EigenGAN: Layer-Wise Eigen-Learning for GANs | Code | 1
3D^2-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling | Code | 1
Parallel Sampling of Diffusion Models | Code | 1
Patched Denoising Diffusion Models For High-Resolution Image Synthesis | Code | 1
Enhanced Balancing GAN: Minority-class Image Generation | Code | 1
EDA-DM: Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models | Code | 1
Polarity Sampling: Quality and Diversity Control of Pre-Trained Generative Networks via Singular Values | Code | 1
PTDiffusion: Free Lunch for Generating Optical Illusion Hidden Pictures with Phase-Transferred Diffusion Model | Code | 1
Page 42 of 134

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Improved DDPM | FID | 12.3 | | Unverified
2 | ADM | FID | 11.84 | | Unverified
3 | BigGAN-deep | FID | 8.1 | | Unverified
4 | Polarity-BigGAN | FID | 6.82 | | Unverified
5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 | | Unverified
6 | MaskGIT | FID | 6.18 | | Unverified
7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 | | Unverified
8 | CDM | FID | 4.88 | | Unverified
9 | ADM-G | FID | 4.59 | | Unverified
10 | RIN | FID | 4.51 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PresGAN | FID | 52.2 | | Unverified
2 | RESFLOW | FID | 48.29 | | Unverified
3 | Residual Flow | FID | 46.37 | | Unverified
4 | GLF+perceptual loss (ours) | FID | 44.6 | | Unverified
5 | ProdPoly no activation functions | FID | 40.45 | | Unverified
6 | ProdPoly no activation functions | FID | 36.77 | | Unverified
7 | ACGAN | FID | 35.47 | | Unverified
8 | DenseFlow-74-10 | FID | 34.9 | | Unverified
9 | NVAE w/ flow | FID | 32.53 | | Unverified
10 | QSNGAN | FID | 31.97 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GLIDE + CLS | FID | 30.87 | | Unverified
2 | GLIDE + CLIP | FID | 30.46 | | Unverified
3 | GLIDE + CLS-FREE | FID | 29.22 | | Unverified
4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 | | Unverified
5 | PGMGAN | FID | 21.73 | | Unverified
6 | CLR-GAN | FID | 20.27 | | Unverified
7 | FM | FID | 14.45 | | Unverified
8 | CT (Direct Generation, NFE=1) | FID | 13 | | Unverified
9 | CT (Direct Generation, NFE=2) | FID | 11.1 | | Unverified
10 | GLIDE +CLS | KID | 7.95 | | Unverified