SOTAVerified

Image Generation

Image Generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

  • Unconditional generation refers to sampling images from the dataset distribution without any conditioning input, i.e. from $p(y)$.
  • Conditional image generation (subtask) refers to sampling images from the dataset distribution conditioned on a label $x$, i.e. from $p(y|x)$.
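
The difference between the two settings can be illustrated with a toy sketch. It is not how any model on this page works: real generative models learn an approximation of the distribution, whereas this example only resamples from a tiny hypothetical dataset. The names `DATASET`, `sample_unconditional`, and `sample_conditional` are assumptions made for illustration, and the strings stand in for images.

```python
import random
from collections import defaultdict

# Hypothetical toy dataset of (label x, sample y) pairs.
# The strings are stand-ins for images.
DATASET = [
    ("cat", "cat_img_0"),
    ("cat", "cat_img_1"),
    ("dog", "dog_img_0"),
    ("dog", "dog_img_1"),
    ("dog", "dog_img_2"),
]


def sample_unconditional(dataset, rng):
    """Draw y from the empirical p(y): the label is ignored."""
    _, y = rng.choice(dataset)
    return y


def sample_conditional(dataset, label, rng):
    """Draw y from the empirical p(y|x): only samples with this label are eligible."""
    by_label = defaultdict(list)
    for x, y in dataset:
        by_label[x].append(y)
    if label not in by_label:
        raise KeyError(f"no samples with label {label!r}")
    return rng.choice(by_label[label])


rng = random.Random(0)  # fixed seed so the run is repeatable
y_uncond = sample_unconditional(DATASET, rng)
y_cond = sample_conditional(DATASET, "cat", rng)
print("unconditional:", y_uncond)
print("conditional on 'cat':", y_cond)
```

The unconditional draw may return a sample of any class, while the conditional draw is restricted to samples whose label matches the requested one.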

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 5451-5500 of 6689 papers

Title | Status | Hype
LDFaceNet: Latent Diffusion-based Network for High-Fidelity Deepfake Generation | | 0
DiTFastAttn: Attention Compression for Diffusion Transformer Models | | 0
DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers | | 0
Ditto: Accelerating Diffusion Model via Temporal Value Similarity | | 0
Symmetrical Flow Matching: Unified Image Generation, Segmentation, and Classification with Score-Based Generative Models | | 0
DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder | | 0
Arbitrary Distributions Mapping via SyMOT-Flow: A Flow-based Approach Integrating Maximum Mean Discrepancy and Optimal Transport | | 0
DivCon: Divide and Conquer for Progressive Text-to-Image Generation | | 0
DiverGAN: An Efficient and Effective Single-Stage Framework for Diverse Text-to-Image Generation | | 0
Diverse and Tailored Image Generation for Zero-shot Multi-label Classification | | 0
Diverse Diffusion: Enhancing Image Diversity in Text-to-Image Generation | | 0
DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows | | 0
Visual Car Brand Classification by Implementing a Synthetic Image Dataset Creation Pipeline | | 0
SyncDiffusion: Coherent Montage via Synchronized Joint Diffusions | | 0
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning | | 0
Diverse Single Image Generation with Controllable Global Structure | | 0
SyncNoise: Geometrically Consistent Noise Prediction for Text-based 3D Scene Editing | | 0
Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images | | 0
Diversity and Diffusion: Observations on Synthetic Image Distributions with Stable Diffusion | | 0
SynNet: Structure-Preserving Fully Convolutional Networks for Medical Image Synthesis | | 0
Diversity Regularized Adversarial Learning | | 0
DIVE: Taming DINO for Subject-Driven Video Editing | | 0
SYNOSIS: Image synthesis pipeline for machine vision in metal surface inspection | | 0
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation | | 0
Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback | | 0
Synth^2: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings | | 0
Synthesis and Edition of Ultrasound Images via Sketch Guided Progressive Growing GANs | | 0
DOCCI: Descriptions of Connected and Contrasting Images | | 0
Visual Concept-driven Image Generation with Text-to-Image Diffusion Model | | 0
Do DALL-E and Flamingo Understand Each Other? | | 0
Do Diffusion Models Learn Semantically Meaningful and Efficient Representations? | | 0
On Error Propagation of Diffusion Models | | 0
Do Distributed Semantic Models Dream of Electric Sheep? Visualizing Word Representations through Image Synthesis | | 0
Does CLIP perceive art the same way we do? | | 0
Do I look like a `cat.n.01` to you? A Taxonomy Image Generation Benchmark | | 0
Domain Adaptation Using Adversarial Learning for Autonomous Navigation | | 0
Synthesis of Annotated Colorectal Cancer Tissue Images from Gland Layout | | 0
Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models | | 0
Domain Generalization for 6D Pose Estimation Through NeRF-based Image Synthesis | | 0
Visual Conceptual Blending with Large-scale Language and Vision Models | | 0
Do Not Mask What You Do Not Need to Mask: a Parser-Free Virtual Try-On | | 0
Don't Forget your Inverse DDIM for Image Editing | | 0
DoodleFormer: Creative Sketch Drawing with Transformers | | 0
DOTE: Dual cOnvolutional filTer lEarning for Super-Resolution and Cross-Modality Synthesis in MRI | | 0
DPAF: Image Synthesis via Differentially Private Aggregation in Forward Phase | | 0
DPCL-Diff: The Temporal Knowledge Graph Reasoning Based on Graph Node Diffusion Model with Dual-Domain Periodic Contrastive Learning | | 0
DPDEdit: Detail-Preserved Diffusion Models for Multimodal Fashion Image Editing | | 0
Synthesis of High-Quality Visible Faces from Polarimetric Thermal Faces using Generative Adversarial Networks | | 0
Boosting Resolution Generalization of Diffusion Transformers with Randomized Positional Encodings | | 0
Dr.3D: Adapting 3D GANs to Artistic Drawings | | 0
Page 110 of 134

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Improved DDPM | FID | 12.3 | | Unverified
2 | ADM | FID | 11.84 | | Unverified
3 | BigGAN-deep | FID | 8.1 | | Unverified
4 | Polarity-BigGAN | FID | 6.82 | | Unverified
5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 | | Unverified
6 | MaskGIT | FID | 6.18 | | Unverified
7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 | | Unverified
8 | CDM | FID | 4.88 | | Unverified
9 | ADM-G | FID | 4.59 | | Unverified
10 | RIN | FID | 4.51 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PresGAN | FID | 52.2 | | Unverified
2 | RESFLOW | FID | 48.29 | | Unverified
3 | Residual Flow | FID | 46.37 | | Unverified
4 | GLF+perceptual loss (ours) | FID | 44.6 | | Unverified
5 | ProdPoly no activation functions | FID | 40.45 | | Unverified
6 | ProdPoly no activation functions | FID | 36.77 | | Unverified
7 | ACGAN | FID | 35.47 | | Unverified
8 | DenseFlow-74-10 | FID | 34.9 | | Unverified
9 | NVAE w/ flow | FID | 32.53 | | Unverified
10 | QSNGAN | FID | 31.97 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GLIDE + CLS | FID | 30.87 | | Unverified
2 | GLIDE + CLIP | FID | 30.46 | | Unverified
3 | GLIDE + CLS-FREE | FID | 29.22 | | Unverified
4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 | | Unverified
5 | PGMGAN | FID | 21.73 | | Unverified
6 | CLR-GAN | FID | 20.27 | | Unverified
7 | FM | FID | 14.45 | | Unverified
8 | CT (Direct Generation, NFE=1) | FID | 13 | | Unverified
9 | CT (Direct Generation, NFE=2) | FID | 11.1 | | Unverified
10 | GLIDE +CLS | KID | 7.95 | | Unverified