SOTAVerified

Image Generation

Image Generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

  • Unconditional generation refers to sampling images $y$ from the data distribution with no conditioning input, i.e. from $p(y)$.
  • Conditional image generation (subtask) refers to sampling images $y$ given a conditioning input such as a label $x$, i.e. from $p(y|x)$.
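
The difference between the two settings can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the labels, the prior weights, and the 1-D Gaussians standing in for images are hypothetical and are not taken from any listed paper. Unconditional sampling marginalises over the label, while conditional sampling fixes it.

```python
import random

# Hypothetical toy model: each label x has a prior weight and a
# 1-D Gaussian (mean, std) standing in for a distribution over images y.
CLASS_PRIOR = {"cat": 0.4, "dog": 0.6}
CLASS_PARAMS = {"cat": (-2.0, 0.5), "dog": (3.0, 1.0)}


def sample_conditional(x, rng):
    """Draw y ~ p(y|x) for a given label x."""
    mean, std = CLASS_PARAMS[x]
    return rng.gauss(mean, std)


def sample_unconditional(rng):
    """Draw y ~ p(y) = sum_x p(x) p(y|x) by first sampling a label."""
    labels = list(CLASS_PRIOR)
    weights = [CLASS_PRIOR[x] for x in labels]
    x = rng.choices(labels, weights=weights, k=1)[0]
    return sample_conditional(x, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    print("conditional on 'cat':", sample_conditional("cat", rng))
    print("unconditional:", sample_unconditional(rng))
```

In a real generator the Gaussians would be replaced by a learned model (a GAN, diffusion model, or autoregressive model), but the two sampling interfaces keep the same shape.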

This section lists state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 4301-4350 of 6689 papers

Title | Status | Hype
Generator Born from Classifier | - | 0
FaceStudio: Put Your Face Everywhere in Seconds | - | 0
Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images | - | 0
ResEnsemble-DDPM: Residual Denoising Diffusion Probabilistic Models for Ensemble Learning | - | 0
A Contrastive Compositional Benchmark for Text-to-Image Synthesis: A Study with Unified Text-to-Image Fidelity Metrics | Code | 0
InstructBooth: Instruction-following Personalized Text-to-Image Generation | - | 0
A multi-channel cycleGAN for CBCT to CT synthesis | - | 0
CT Reconstruction using Diffusion Posterior Sampling conditioned on a Nonlinear Measurement Model | - | 0
Stable Messenger: Steganography for Message-Concealed Image Generation | - | 0
Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models | - | 0
Token Fusion: Bridging the Gap between Token Pruning and Token Merging | - | 0
Generating Images of the M87* Black Hole Using GANs | Code | 0
LDM-ISP: Enhancing Neural ISP for Low Light with Latent Diffusion Models | - | 0
Ultra-Resolution Cascaded Diffusion Model for Gigapixel Image Synthesis in Histopathology | - | 0
Enhancing Diffusion Models with 3D Perspective Geometry Constraints | - | 0
Pipeline Enabling Zero-shot Classification for Bangla Handwritten Grapheme | - | 0
Generative models for visualising abstract social processes: Guiding streetview image synthesis of StyleGAN2 with indices of deprivation | - | 0
DFU: scale-robust diffusion model for zero-shot super-resolution image generation | Code | 0
HiPA: Enabling One-Step Text-to-Image Diffusion Models via High-Frequency-Promoting Adaptation | - | 0
Few-shot Image Generation via Style Adaptation and Content Preservation | - | 0
Detailed Human-Centric Text Description-Driven Large Scene Synthesis | - | 0
CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation | - | 0
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation | - | 0
Anatomy and Physiology of Artificial Intelligence in PET Imaging | - | 0
HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models | - | 0
Layered Rendering Diffusion Model for Controllable Zero-Shot Image Synthesis | Code | 0
Diffusion Models Without Attention | - | 0
ZeST-NeRF: Using temporal aggregation for Zero-Shot Temporal NeRFs | - | 0
S2ST: Image-to-Image Translation in the Seed Space of Latent Diffusion | - | 0
Non-Visible Light Data Synthesis and Application: A Case Study for Synthetic Aperture Radar Imagery | - | 0
DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback | - | 0
HiDiffusion: Unlocking Higher-Resolution Creativity and Efficiency in Pretrained Diffusion Models | - | 0
Fair Text-to-Image Diffusion via Fair Mapping | - | 0
Image Inpainting via Tractable Steering of Diffusion Models | Code | 0
Large Language Models Suffer From Their Own Output: An Analysis of the Self-Consuming Training Loop | - | 0
A High-Quality Robust Diffusion Framework for Corrupted Dataset | Code | 0
Unlocking Spatial Comprehension in Text-to-Image Diffusion Models | - | 0
MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices | - | 0
TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering | - | 0
Manifold Preserving Guided Diffusion | - | 0
COLE: A Hierarchical Generation Framework for Multi-Layered and Editable Graphic Design | - | 0
Denoising Diffusion Probabilistic Models for Image Inpainting of Cell Distributions in the Human Brain | - | 0
Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation | - | 0
Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation | - | 0
Improving Denoising Diffusion Probabilistic Models via Exploiting Shared Representations | - | 0
Reinforcement Learning from Diffusion Feedback: Q* for Image Search | - | 0
VehicleGAN: Pair-flexible Pose Guided Image Synthesis for Vehicle Re-identification | - | 0
Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation | - | 0
Direct2.5: Diverse Text-to-3D Generation via Multi-view 2.5D Diffusion | - | 0
ET3D: Efficient Text-to-3D Generation via Multi-View Distillation | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Improved DDPM | FID | 12.3 | - | Unverified
2 | ADM | FID | 11.84 | - | Unverified
3 | BigGAN-deep | FID | 8.1 | - | Unverified
4 | Polarity-BigGAN | FID | 6.82 | - | Unverified
5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 | - | Unverified
6 | MaskGIT | FID | 6.18 | - | Unverified
7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 | - | Unverified
8 | CDM | FID | 4.88 | - | Unverified
9 | ADM-G | FID | 4.59 | - | Unverified
10 | RIN | FID | 4.51 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PresGAN | FID | 52.2 | - | Unverified
2 | RESFLOW | FID | 48.29 | - | Unverified
3 | Residual Flow | FID | 46.37 | - | Unverified
4 | GLF+perceptual loss (ours) | FID | 44.6 | - | Unverified
5 | ProdPoly no activation functions | FID | 40.45 | - | Unverified
6 | ProdPoly no activation functions | FID | 36.77 | - | Unverified
7 | ACGAN | FID | 35.47 | - | Unverified
8 | DenseFlow-74-10 | FID | 34.9 | - | Unverified
9 | NVAE w/ flow | FID | 32.53 | - | Unverified
10 | QSNGAN | FID | 31.97 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GLIDE + CLS | FID | 30.87 | - | Unverified
2 | GLIDE + CLIP | FID | 30.46 | - | Unverified
3 | GLIDE + CLS-FREE | FID | 29.22 | - | Unverified
4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 | - | Unverified
5 | PGMGAN | FID | 21.73 | - | Unverified
6 | CLR-GAN | FID | 20.27 | - | Unverified
7 | FM | FID | 14.45 | - | Unverified
8 | CT (Direct Generation, NFE=1) | FID | 13 | - | Unverified
9 | CT (Direct Generation, NFE=2) | FID | 11.1 | - | Unverified
10 | GLIDE +CLS | KID | 7.95 | - | Unverified