SOTAVerified

Image Generation

Image Generation (synthesis) is the task of generating new images from an existing dataset.

  • Unconditional generation refers to sampling new images from the distribution of the dataset without any conditioning input, i.e. $p(y)$.
  • Conditional image generation (subtask) refers to sampling new images given a conditioning input such as a label $x$, i.e. $p(y|x)$.
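
As a rough illustration of the difference between $p(y)$ and $p(y|x)$, the sketch below samples from a tiny labeled dataset in both ways. The dataset, the function names, and the use of strings in place of images are all hypothetical and chosen only for this example; a real generative model learns these distributions rather than resampling stored items.

```python
import random

# Hypothetical toy dataset of (label x, sample y) pairs.
# Strings stand in for images purely for illustration.
DATASET = [
    ("cat", "cat_0"), ("cat", "cat_1"),
    ("dog", "dog_0"), ("dog", "dog_1"), ("dog", "dog_2"),
]


def sample_unconditional(rng):
    """Draw y ~ p(y): the label is ignored entirely."""
    _, y = rng.choice(DATASET)
    return y


def sample_conditional(rng, label):
    """Draw y ~ p(y | x = label): only samples carrying that label are eligible."""
    candidates = [y for x, y in DATASET if x == label]
    if not candidates:
        raise ValueError(f"no samples with label {label!r}")
    return rng.choice(candidates)


if __name__ == "__main__":
    rng = random.Random(0)
    print("unconditional:", sample_unconditional(rng))
    print("conditional on 'dog':", sample_conditional(rng, "dog"))
```

The conditional sampler can only ever return items that match the requested label, while the unconditional sampler may return any item in the dataset.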

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 5401-5450 of 6689 papers

| Title | Status | Hype |
|---|---|---|
| Directional GAN: A Novel Conditioning Strategy for Generative Networks | | 0 |
| Direction-Aware Diagonal Autoregressive Image Generation | | 0 |
| Disability Representations: Finding Biases in Automatic Image Generation | | 0 |
| Boost Your Human Image Generation Model via Direct Preference Optimization | | 0 |
| Boosting Unconstrained Face Recognition with Targeted Style Adversary | | 0 |
| DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis | | 0 |
| Discovering Bugs in Vision Models using Off-the-shelf Image Generation and Captioning | | 0 |
| Discovering Class-Specific GAN Controls for Semantic Image Synthesis | | 0 |
| Artificial Intelligence for Pediatric Ophthalmology | | 0 |
| SVFit: Parameter-Efficient Fine-Tuning of Large Pre-Trained Models Using Singular Values | | 0 |
| Discrepancy-based Diffusion Models for Lesion Detection in Brain MRI | | 0 |
| Discrete Modeling via Boundary Conditional Diffusion Processes | | 0 |
| Discrete Predictor-Corrector Diffusion Models for Image Synthesis | | 0 |
| Discrete Spatial Diffusion: Intensity-Preserving Diffusion Modeling | | 0 |
| Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning | | 0 |
| Discriminative Consistent Domain Generation for Semi-supervised Learning | | 0 |
| SVS-GAN: Leveraging GANs for Semantic Video Synthesis | | 0 |
| Discriminative Hamiltonian Variational Autoencoder for Accurate Tumor Segmentation in Data-Scarce Regimes | | 0 |
| Discriminative Image Generation with Diffusion Models for Zero-Shot Learning | | 0 |
| Discriminative Probing and Tuning for Text-to-Image Generation | | 0 |
| Discriminator Contrastive Divergence: Semi-Amortized Generative Modeling by Exploring Energy of the Discriminator | | 0 |
| VistaDepth: Frequency Modulation With Bias Reweighting For Enhanced Long-Range Depth Estimation | | 0 |
| Discriminator-Free Direct Preference Optimization for Video Diffusion | | 0 |
| SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing | | 0 |
| Disentangled Diffusion Autoencoder for Harmonization of Multi-site Neuroimaging Data | | 0 |
| SwapText: Image Based Texts Transfer in Scenes | | 0 |
| Disentangled Latent Energy-Based Style Translation: An Image-Level Structural MRI Harmonization Framework | | 0 |
| Disentangled Representation Learning for Controllable Person Image Generation | | 0 |
| Disentangled Representation Learning GAN for Pose-Invariant Face Recognition | | 0 |
| SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules | | 0 |
| Disentangling Latent Factors of Variational Auto-Encoder with Whitening | | 0 |
| Disentangling Latent Hands for Image Synthesis and Pose Estimation | | 0 |
| Disentangling Regional Primitives for Image Generation | | 0 |
| DisEnvisioner: Disentangled and Enriched Visual Prompt for Customized Image Generation | | 0 |
| Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image Generation | | 0 |
| Disrupting Model Merging: A Parameter-Level Defense Without Sacrificing Accuracy | | 0 |
| SwipeGANSpace: Swipe-to-Compare Image Generation via Efficient Latent Space Exploration | | 0 |
| Visual Analysis of Prediction Uncertainty in Neural Networks for Deep Image Synthesis | | 0 |
| Distance Weighted Trans Network for Image Completion | | 0 |
| Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge | | 0 |
| Distilling GANs with Style-Mixed Triplets for X2I Translation with Limited Data | | 0 |
| Distilling semantically aware orders for autoregressive image generation | | 0 |
| Distilling Vision-Language Foundation Models: A Data-Free Approach via Prompt Diversification | | 0 |
| Distortion Estimation Through Explicit Modeling of the Refractive Surface | | 0 |
| Distributional Vision-Language Alignment by Cauchy-Schwarz Divergence | | 0 |
| Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis | | 0 |
| Distribution-Conditional Generation: From Class Distribution to Creative Generation | | 0 |
| DiT4Edit: Diffusion Transformer for Image Editing | | 0 |
| DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution | | 0 |
| DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation | | 0 |

Benchmark Results

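| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Improved DDPM | FID | 12.3 | | Unverified |
| 2 | ADM | FID | 11.84 | | Unverified |
| 3 | BigGAN-deep | FID | 8.1 | | Unverified |
| 4 | Polarity-BigGAN | FID | 6.82 | | Unverified |
| 5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 | | Unverified |
| 6 | MaskGIT | FID | 6.18 | | Unverified |
| 7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 | | Unverified |
| 8 | CDM | FID | 4.88 | | Unverified |
| 9 | ADM-G | FID | 4.59 | | Unverified |
| 10 | RIN | FID | 4.51 | | Unverified |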
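
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PresGAN | FID | 52.2 | | Unverified |
| 2 | RESFLOW | FID | 48.29 | | Unverified |
| 3 | Residual Flow | FID | 46.37 | | Unverified |
| 4 | GLF+perceptual loss (ours) | FID | 44.6 | | Unverified |
| 5 | ProdPoly no activation functions | FID | 40.45 | | Unverified |
| 6 | ProdPoly no activation functions | FID | 36.77 | | Unverified |
| 7 | ACGAN | FID | 35.47 | | Unverified |
| 8 | DenseFlow-74-10 | FID | 34.9 | | Unverified |
| 9 | NVAE w/ flow | FID | 32.53 | | Unverified |
| 10 | QSNGAN | FID | 31.97 | | Unverified |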
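
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GLIDE + CLS | FID | 30.87 | | Unverified |
| 2 | GLIDE + CLIP | FID | 30.46 | | Unverified |
| 3 | GLIDE + CLS-FREE | FID | 29.22 | | Unverified |
| 4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 | | Unverified |
| 5 | PGMGAN | FID | 21.73 | | Unverified |
| 6 | CLR-GAN | FID | 20.27 | | Unverified |
| 7 | FM | FID | 14.45 | | Unverified |
| 8 | CT (Direct Generation, NFE=1) | FID | 13 | | Unverified |
| 9 | CT (Direct Generation, NFE=2) | FID | 11.1 | | Unverified |
| 10 | GLIDE +CLS | KID | 7.95 | | Unverified |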
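
Most entries above report FID (Fréchet Inception Distance), where lower is better. FID is the Fréchet distance between two Gaussians fitted to feature vectors of real and generated images. The following is a minimal sketch of that distance, assuming the features have already been extracted (in practice they come from an Inception network, which is not included here). The function names are hypothetical, and this is not the evaluation code used for any result listed on this page.

```python
import numpy as np


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2).

    d^2 = ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * (sigma1 @ sigma2)^(1/2))
    """
    diff = mu1 - mu2
    # For positive semi-definite covariances, the eigenvalues of sigma1 @ sigma2
    # are real and non-negative, so Tr((sigma1 sigma2)^(1/2)) is the sum of
    # their square roots. Small negative or imaginary parts are numerical noise.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * trace_sqrt)


def fid_from_features(real_feats, fake_feats):
    """FID-style score from two (n_samples, n_features) arrays of features."""
    mu_r, mu_f = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    sigma_r = np.cov(real_feats, rowvar=False)
    sigma_f = np.cov(fake_feats, rowvar=False)
    return frechet_distance(mu_r, sigma_r, mu_f, sigma_f)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(500, 8))            # stand-in for real-image features
    fake = rng.normal(loc=0.5, size=(500, 8))   # stand-in for generated-image features
    print("score:", fid_from_features(real, fake))
```

Scores are only comparable when the feature extractor, the reference data, and the sample counts match, so figures from different leaderboards should not be compared directly.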