SOTAVerified

Image Generation

Image Generation (synthesis) is the task of generating new images from an existing dataset.

  • Unconditional generation refers to sampling images without any conditioning input, i.e. from $p(y)$.
  • Conditional image generation (subtask) refers to sampling images conditioned on a label $x$, i.e. from $p(y|x)$.
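
The difference between the two settings can be illustrated with a toy sketch. It assumes a tiny hypothetical dataset of (label, image) pairs in which strings stand in for images, and it resamples the empirical distribution rather than learning a model, so it only shows what sampling from $p(y)$ versus $p(y|x)$ means. The names `DATASET`, `sample_unconditional`, and `sample_conditional` are illustrative and do not come from any library.

```python
import random

# Toy stand-in for an image dataset: each entry is (label x, "image" y).
# Real images would be pixel arrays; short strings keep the sketch readable.
DATASET = [
    ("cat", "cat_img_0"),
    ("cat", "cat_img_1"),
    ("dog", "dog_img_0"),
    ("dog", "dog_img_1"),
    ("dog", "dog_img_2"),
]


def sample_unconditional(rng):
    """Draw y ~ p(y): ignore the labels and sample from the whole dataset."""
    _, y = rng.choice(DATASET)
    return y


def sample_conditional(label, rng):
    """Draw y ~ p(y | x = label): sample only among entries with that label."""
    candidates = [y for x, y in DATASET if x == label]
    if not candidates:
        raise ValueError(f"no samples with label {label!r}")
    return rng.choice(candidates)


rng = random.Random(0)  # fixed seed so repeated runs give the same draws
unconditional = [sample_unconditional(rng) for _ in range(5)]
conditional = [sample_conditional("cat", rng) for _ in range(5)]
print(unconditional)
print(conditional)
```

A real generative model would replace the lookups with a learned sampler, but the interface stays the same: the conditional sampler takes the label as an extra input.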

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 5601–5650 of 6689 papers

Title | Status | Hype
Effective Shortcut Technique for GAN | | 0
Effect of Instance Normalization on Fine-Grained Control for Sketch-Based Face Image Generation | | 0
Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion | | 0
Efficient-3DiM: Learning a Generalizable Single-image Novel-view Synthesizer in One Day | | 0
Synthetic Lung X-ray Generation through Cross-Attention and Affinity Transformation | | 0
Efficient Cross-Lingual Transfer for Chinese Stable Diffusion with Images as Pivots | | 0
Block and Detail: Scaffolding Sketch-to-Image Generation | | 0
Synthetic Medical Imaging Generation with Generative Adversarial Networks For Plain Radiographs | | 0
Efficient Diffusion Training through Parallelization with Truncated Karhunen-Loève Expansion | | 0
Synthetic Privileged Information Enhances Medical Image Representation Learning | | 0
Synthetic Shifts to Initial Seed Vector Exposes the Brittle Nature of Latent-Based Diffusion Models | | 0
Efficient Exploration of Image Classifier Failures with Bayesian Optimization and Text-to-Image Models | | 0
Efficient Flow Matching using Latent Variables | | 0
Efficient Generative Modeling with Residual Vector Quantization-Based Tokens | | 0
Efficient Hair Style Transfer with Generative Adversarial Networks | | 0
Visual Indeterminacy in GAN Art | | 0
SynthRAD2025 Grand Challenge dataset: generating synthetic CTs for radiotherapy | | 0
Efficient Listener: Dyadic Facial Motion Synthesis via Action Diffusion | | 0
Efficient Multi-Instance Generation with Janus-Pro-Dirven Prompt Parsing | | 0
Visual Lexicon: Rich Image Features in Language Space | | 0
Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion | | 0
Efficient Quantization Strategies for Latent Diffusion Models | | 0
Semantically-Prompted Language Models Improve Visual Descriptions | | 0
Efficient Scaling of Diffusion Transformers for Text-to-Image Generation | | 0
Efficient training for future video generation based on hierarchical disentangled representation of latent variables | | 0
Efficient Training with Denoised Neural Weights | | 0
Efficient Transfer Learning in Diffusion Models via Adversarial Noise | | 0
Systematic Analysis of Image Generation using GANs | | 0
Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers | | 0
Systematic Review of Techniques in Brain Image Synthesis using Deep Learning | | 0
Enhancing CT Image synthesis from multi-modal MRI data based on a multi-task neural network framework | | 0
T2CI-GAN: Text to Compressed Image generation using Generative Adversarial Network | | 0
EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation | | 0
Hierarchy Composition GAN for High-fidelity Image Synthesis | | 0
EINS: Long Short-Term Memory with Extrapolated Input Network Simplification | | 0
T2IAT: Measuring Valence and Stereotypical Biases in Text-to-Image Generation | | 0
ELBO-T2IAlign: A Generic ELBO-Based Method for Calibrating Pixel-level Text-Image Alignment in Diffusion Models | | 0
BlobGAN-3D: A Spatially-Disentangled 3D-Aware Generative Model for Indoor Scenes | | 0
BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing | | 0
ELODIN: Naming Concepts in Embedding Spaces | | 0
Elucidating Flow Matching ODE Dynamics with Respect to Data Geometries | | 0
Adapting Text-to-Image Generation with Feature Difference Instruction for Generic Image Restoration | | 0
BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing | | 0
T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts | | 0
Visual Madlibs: Fill in the blank Image Generation and Question Answering | | 0
Emage: Non-Autoregressive Text-to-Image Generation | | 0
T2IW: Joint Text to Image & Watermark Generation | | 0
Emergence and Evolution of Interpretable Concepts in Diffusion Models | | 0
Blind Motion Deblurring through SinGAN Architecture | | 0
EM-GAN: Fast Stress Analysis for Multi-Segment Interconnect Using Generative Adversarial Networks | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Improved DDPM | FID | 12.3 | | Unverified
2 | ADM | FID | 11.84 | | Unverified
3 | BigGAN-deep | FID | 8.1 | | Unverified
4 | Polarity-BigGAN | FID | 6.82 | | Unverified
5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 | | Unverified
6 | MaskGIT | FID | 6.18 | | Unverified
7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 | | Unverified
8 | CDM | FID | 4.88 | | Unverified
9 | ADM-G | FID | 4.59 | | Unverified
10 | RIN | FID | 4.51 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PresGAN | FID | 52.2 | | Unverified
2 | RESFLOW | FID | 48.29 | | Unverified
3 | Residual Flow | FID | 46.37 | | Unverified
4 | GLF+perceptual loss (ours) | FID | 44.6 | | Unverified
5 | ProdPoly no activation functions | FID | 40.45 | | Unverified
6 | ProdPoly no activation functions | FID | 36.77 | | Unverified
7 | ACGAN | FID | 35.47 | | Unverified
8 | DenseFlow-74-10 | FID | 34.9 | | Unverified
9 | NVAE w/ flow | FID | 32.53 | | Unverified
10 | QSNGAN | FID | 31.97 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GLIDE + CLS | FID | 30.87 | | Unverified
2 | GLIDE + CLIP | FID | 30.46 | | Unverified
3 | GLIDE + CLS-FREE | FID | 29.22 | | Unverified
4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 | | Unverified
5 | PGMGAN | FID | 21.73 | | Unverified
6 | CLR-GAN | FID | 20.27 | | Unverified
7 | FM | FID | 14.45 | | Unverified
8 | CT (Direct Generation, NFE=1) | FID | 13 | | Unverified
9 | CT (Direct Generation, NFE=2) | FID | 11.1 | | Unverified
10 | GLIDE +CLS | KID | 7.95 | | Unverified
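
Most entries above report FID (Fréchet Inception Distance). The page itself does not define the metric, so the following is the commonly used definition, given here as a reference sketch. The symbols are notation assumed for illustration: $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the mean and covariance of Inception-network features computed on real and generated images, respectively.

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \mathrm{Tr}\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
$$

Lower values indicate that the generated feature statistics are closer to the real ones. Scores are generally comparable only within the same dataset, resolution, and evaluation protocol.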