SOTAVerified

Image Generation

Image Generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

  • Unconditional generation refers to sampling images from the data distribution without any conditioning input, i.e. from $p(y)$.
  • Conditional image generation (subtask) refers to sampling images conditioned on a label $x$, i.e. from $p(y|x)$.
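
The difference between the two settings can be illustrated with a minimal sketch. It assumes a hypothetical toy dataset of 4-pixel "images" with binary labels, and it uses an independent per-pixel Gaussian as a stand-in for a real generative model; the names `fit_gaussian`, `sample`, and `make_toy_dataset` are illustrative and do not come from any library or paper listed here.

```python
import random
from statistics import mean, pstdev


def fit_gaussian(images):
    """Fit an independent Gaussian (mean, std) to each pixel position."""
    pixels = list(zip(*images))  # transpose: one tuple per pixel position
    return [(mean(p), pstdev(p)) for p in pixels]


def sample(params, rng):
    """Draw one image by sampling every pixel from its fitted Gaussian."""
    return [rng.gauss(mu, sigma) for mu, sigma in params]


def make_toy_dataset(rng, n_per_class=200, n_pixels=4):
    """Hypothetical data: label 0 images are dark (~0.2), label 1 bright (~0.8)."""
    data = []
    for label, level in ((0, 0.2), (1, 0.8)):
        for _ in range(n_per_class):
            image = [rng.gauss(level, 0.05) for _ in range(n_pixels)]
            data.append((image, label))
    return data


rng = random.Random(0)
dataset = make_toy_dataset(rng)

# Unconditional model p(y): fit on all images and ignore the labels.
uncond_params = fit_gaussian([img for img, _ in dataset])

# Conditional model p(y|x): one set of parameters per label x.
cond_params = {
    label: fit_gaussian([img for img, lab in dataset if lab == label])
    for label in (0, 1)
}

y_uncond = sample(uncond_params, rng)   # y ~ p(y)
y_bright = sample(cond_params[1], rng)  # y ~ p(y | x = 1)
```

In this toy setup the unconditional fit averages over both labels, while the conditional fit for `x = 1` only produces bright images. A single Gaussian is a poor model of the two-mode data and is used here only to keep the sketch short.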

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 2351-2400 of 6689 papers

| Title | Status | Hype |
|---|---|---|
| Do DALL-E and Flamingo Understand Each Other? | | 0 |
| DOCCI: Descriptions of Connected and Contrasting Images | | 0 |
| Cheap-fake Detection with LLM using Prompt Engineering | | 0 |
| Are handcrafted filters helpful for attributing AI-generated images? | | 0 |
| Generative Probabilistic Image Colorization | | 0 |
| Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback | | 0 |
| ChatScratch: An AI-Augmented System Toward Autonomous Visual Programming Learning for Children Aged 6-12 | | 0 |
| Generative Image Modeling using Style and Structure Adversarial Networks | | 0 |
| Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation | | 0 |
| DIVE: Taming DINO for Subject-Driven Video Editing | | 0 |
| ChatPainter: Improving Text to Image Generation using Dialogue | | 0 |
| Are generative models fair? A study of racial bias in dermatological image generation | | 0 |
| Generative Model for Zero-Shot Sketch-Based Image Retrieval | | 0 |
| Diversity Regularized Adversarial Learning | | 0 |
| ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting | | 0 |
| Diversity and Diffusion: Observations on Synthetic Image Distributions with Stable Diffusion | | 0 |
| Accelerating Diffusion Models via Pre-segmentation Diffusion Sampling for Medical Image Segmentation | | 0 |
| ChatAnything: Facetime Chat with LLM-Enhanced Personas | | 0 |
| Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images | | 0 |
| Are Conditional Latent Diffusion Models Effective for Image Restoration? | | 0 |
| Diverse Single Image Generation with Controllable Global Structure | | 0 |
| Adversarially robust segmentation models learn perceptually-aligned gradients | | 0 |
| Generative Image Modeling Using Spatial LSTMs | | 0 |
| Character Generation through Self-Supervised Vectorization | | 0 |
| Generative Flows with Invertible Attentions | | 0 |
| Are conditional GANs explicitly conditional? | | 0 |
| Adversarially Perturbed Wavelet-based Morphed Face Generation | | 0 |
| Generative Guiding Block: Synthesizing Realistic Looking Variants Capable of Even Large Change Demands | | 0 |
| DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows | | 0 |
| Diverse Diffusion: Enhancing Image Diversity in Text-to-Image Generation | | 0 |
| A Recipe for Scaling up Text-to-Video Generation with Text-free Videos | | 0 |
| PixCell: A generative foundation model for digital histopathology images | | 0 |
| Diverse and Tailored Image Generation for Zero-shot Multi-label Classification | | 0 |
| DiverGAN: An Efficient and Effective Single-Stage Framework for Diverse Text-to-Image Generation | | 0 |
| DivCon: Divide and Conquer for Progressive Text-to-Image Generation | | 0 |
| DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder | | 0 |
| Ditto: Accelerating Diffusion Model via Temporal Value Similarity | | 0 |
| DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers | | 0 |
| Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step | | 0 |
| FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency | | 0 |
| Generative Modeling of Individual Behavior at Scale | | 0 |
| DiTFastAttn: Attention Compression for Diffusion Transformer Models | | 0 |
| Adversarially Approximated Autoencoder for Image Generation and Manipulation | | 0 |
| DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation | | 0 |
| DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution | | 0 |
| CGOF++: Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields | | 0 |
| EIUP: A Training-Free Approach to Erase Non-Compliant Concepts Conditioned on Implicit Unsafe Prompts | | 0 |
| DiT4Edit: Diffusion Transformer for Image Editing | | 0 |
| CG-NeRF: Conditional Generative Neural Radiance Fields | | 0 |
| Adversarial Learning of Semantic Relevance in Text to Image Synthesis | | 0 |

Benchmark Results

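| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Improved DDPM | FID | 12.3 | | Unverified |
| 2 | ADM | FID | 11.84 | | Unverified |
| 3 | BigGAN-deep | FID | 8.1 | | Unverified |
| 4 | Polarity-BigGAN | FID | 6.82 | | Unverified |
| 5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 | | Unverified |
| 6 | MaskGIT | FID | 6.18 | | Unverified |
| 7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 | | Unverified |
| 8 | CDM | FID | 4.88 | | Unverified |
| 9 | ADM-G | FID | 4.59 | | Unverified |
| 10 | RIN | FID | 4.51 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PresGAN | FID | 52.2 | | Unverified |
| 2 | RESFLOW | FID | 48.29 | | Unverified |
| 3 | Residual Flow | FID | 46.37 | | Unverified |
| 4 | GLF+perceptual loss (ours) | FID | 44.6 | | Unverified |
| 5 | ProdPoly no activation functions | FID | 40.45 | | Unverified |
| 6 | ProdPoly no activation functions | FID | 36.77 | | Unverified |
| 7 | ACGAN | FID | 35.47 | | Unverified |
| 8 | DenseFlow-74-10 | FID | 34.9 | | Unverified |
| 9 | NVAE w/ flow | FID | 32.53 | | Unverified |
| 10 | QSNGAN | FID | 31.97 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GLIDE + CLS | FID | 30.87 | | Unverified |
| 2 | GLIDE + CLIP | FID | 30.46 | | Unverified |
| 3 | GLIDE + CLS-FREE | FID | 29.22 | | Unverified |
| 4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 | | Unverified |
| 5 | PGMGAN | FID | 21.73 | | Unverified |
| 6 | CLR-GAN | FID | 20.27 | | Unverified |
| 7 | FM | FID | 14.45 | | Unverified |
| 8 | CT (Direct Generation, NFE=1) | FID | 13 | | Unverified |
| 9 | CT (Direct Generation, NFE=2) | FID | 11.1 | | Unverified |
| 10 | GLIDE + CLS | KID | 7.95 | | Unverified |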