SOTAVerified

Image Generation

Image generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

  • Unconditional generation refers to sampling images from the data distribution without any conditioning input, i.e. $p(y)$, where $y$ is an image.
  • Conditional image generation (subtask) refers to sampling images conditioned on an input $x$, such as a class label, i.e. $p(y|x)$.
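
The difference between $p(y)$ and $p(y|x)$ can be illustrated with a toy sketch. It is not how a real generative model works: a trained model learns the distribution, whereas this example simply resamples a tiny labeled dataset. The dataset contents and the function names `sample_unconditional` and `sample_conditional` are hypothetical and chosen only for illustration.

```python
import random

# Toy "dataset" of (label x, image y) pairs; the strings stand in for images.
# All entries are made up for illustration.
DATASET = [
    ("cat", "cat_img_0"),
    ("cat", "cat_img_1"),
    ("dog", "dog_img_0"),
    ("dog", "dog_img_1"),
    ("dog", "dog_img_2"),
]


def sample_unconditional(rng):
    """Draw y ~ p(y): pick any image, ignoring its label."""
    _, image = rng.choice(DATASET)
    return image


def sample_conditional(label, rng):
    """Draw y ~ p(y | x=label): pick only among images with that label."""
    candidates = [image for x, image in DATASET if x == label]
    if not candidates:
        raise ValueError(f"no images with label {label!r}")
    return rng.choice(candidates)


rng = random.Random(0)  # fixed seed so repeated runs give the same draws
unconditional = [sample_unconditional(rng) for _ in range(5)]
conditional = [sample_conditional("cat", rng) for _ in range(5)]
print("p(y):      ", unconditional)
print("p(y|x=cat):", conditional)
```

Unconditional draws may come from any class, while the conditional draws are restricted to the requested label.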

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 3451-3500 of 6689 papers

| Title | Status | Hype |
| --- | --- | --- |
| A Multimodal Visual Encoding Model Aided by Introducing Verbal Semantic Information | | 0 |
| Identifying and Mitigating the Security Risks of Generative AI | | 0 |
| RobustCLEVR: A Benchmark and Framework for Evaluating Robustness in Object-centric Learning | | 0 |
| Pixel-Aware Stable Diffusion for Realistic Image Super-resolution and Personalized Stylization | Code | 3 |
| Bi-Modality Medical Image Synthesis Using Semi-Supervised Sequential Generative Adversarial Networks | Code | 0 |
| ORES: Open-vocabulary Responsible Visual Synthesis | Code | 1 |
| DiffI2I: Efficient Diffusion Model for Image-to-Image Translation | | 0 |
| Arbitrary Distributions Mapping via SyMOT-Flow: A Flow-based Approach Integrating Maximum Mean Discrepancy and Optimal Transport | | 0 |
| Residual Denoising Diffusion Models | Code | 2 |
| Is Deep Learning Network Necessary for Image Generation? | | 0 |
| Diff-Retinex: Rethinking Low-light Image Enhancement with A Generative Diffusion Model | | 0 |
| A Survey of Diffusion Based Image Generation Models: Issues and Their Solutions | | 0 |
| Dense Text-to-Image Generation with Attention Modulation | Code | 2 |
| Learning to Decouple and Generate Seismic Random Noise via Invertible Neural Network | Code | 0 |
| CoC-GAN: Employing Context Cluster for Unveiling a New Pathway in Image Generation | | 0 |
| DISGAN: Wavelet-informed Discriminator Guides GAN to MRI Super-resolution with Noise Cleaning | Code | 0 |
| LFS-GAN: Lifelong Few-Shot Image Generation | Code | 1 |
| Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages | Code | 6 |
| Manipulating Embeddings of Stable Diffusion Prompts | Code | 1 |
| Augmenting medical image classifiers with synthetic data from latent diffusion models | | 0 |
| Efficient Transfer Learning in Diffusion Models via Adversarial Noise | | 0 |
| Adversarial Illusions in Multi-Modal Embeddings | Code | 1 |
| Open Set Synthetic Image Source Attribution | | 0 |
| Semantic RGB-D Image Synthesis | | 0 |
| MosaiQ: Quantum Generative Adversarial Networks for Image Generation on NISQ Computers | | 0 |
| Hyperspectral Blind Unmixing using a Double Deep Image Prior | Code | 0 |
| Debiasing Counterfactuals In the Presence of Spurious Correlations | | 0 |
| Sampling From Autoencoders' Latent Space via Quantization And Probability Mass Function Concepts | | 0 |
| Spiking-Diffusion: Vector Quantized Discrete Diffusion Model with Spiking Neural Networks | Code | 1 |
| SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form Layout-to-Image Generation | | 0 |
| I3: Intent-Introspective Retrieval Conditioned on Instructions | | 0 |
| ControlCom: Controllable Image Composition using Diffusion Model | Code | 1 |
| ASPIRE: Language-Guided Data Augmentation for Improving Robustness Against Spurious Correlations | Code | 0 |
| DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability | | 0 |
| RFDforFin: Robust Deep Forgery Detection for GAN-generated Fingerprint Images | | 0 |
| Guide3D: Create 3D Avatars from Text and Image Guidance | | 0 |
| Watch Your Steps: Local Image and Scene Editing by Text Instructions | | 0 |
| Painter: Teaching Auto-regressive Language Models to Draw Sketches | | 0 |
| Diff-CAPTCHA: An Image-based CAPTCHA with Security Enhanced by Denoising Diffusion Model | | 0 |
| Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis | Code | 1 |
| Likelihood-Based Text-to-Image Evaluation with Patch-Level Perceptual and Semantic Credit Assignment | Code | 0 |
| Denoising Diffusion Probabilistic Model for Retinal Image Generation and Segmentation | Code | 1 |
| SGDiff: A Style Guided Diffusion Model for Fashion Synthesis | Code | 1 |
| Story Visualization by Online Text Augmentation with Context Memory | Code | 1 |
| Inversion-by-Inversion: Exemplar-based Sketch-to-Photo Synthesis via Stochastic Differential Equations without Training | Code | 1 |
| SciRE-Solver: Accelerating Diffusion Models Sampling by Score-integrand Solver with Recursive Difference | | 0 |
| MarkovGen: Structured Prediction for Efficient Text-to-Image Generation | | 0 |
| Semantic-aware Network for Aerial-to-Ground Image Synthesis | Code | 0 |
| Bayesian Flow Networks | Code | 2 |
| LAW-Diffusion: Complex Scene Generation by Diffusion with Layouts | | 0 |

Benchmark Results

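| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Improved DDPM | FID | 12.3 | | Unverified |
| 2 | ADM | FID | 11.84 | | Unverified |
| 3 | BigGAN-deep | FID | 8.1 | | Unverified |
| 4 | Polarity-BigGAN | FID | 6.82 | | Unverified |
| 5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 | | Unverified |
| 6 | MaskGIT | FID | 6.18 | | Unverified |
| 7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 | | Unverified |
| 8 | CDM | FID | 4.88 | | Unverified |
| 9 | ADM-G | FID | 4.59 | | Unverified |
| 10 | RIN | FID | 4.51 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PresGAN | FID | 52.2 | | Unverified |
| 2 | RESFLOW | FID | 48.29 | | Unverified |
| 3 | Residual Flow | FID | 46.37 | | Unverified |
| 4 | GLF+perceptual loss (ours) | FID | 44.6 | | Unverified |
| 5 | ProdPoly no activation functions | FID | 40.45 | | Unverified |
| 6 | ProdPoly no activation functions | FID | 36.77 | | Unverified |
| 7 | ACGAN | FID | 35.47 | | Unverified |
| 8 | DenseFlow-74-10 | FID | 34.9 | | Unverified |
| 9 | NVAE w/ flow | FID | 32.53 | | Unverified |
| 10 | QSNGAN | FID | 31.97 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GLIDE + CLS | FID | 30.87 | | Unverified |
| 2 | GLIDE + CLIP | FID | 30.46 | | Unverified |
| 3 | GLIDE + CLS-FREE | FID | 29.22 | | Unverified |
| 4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 | | Unverified |
| 5 | PGMGAN | FID | 21.73 | | Unverified |
| 6 | CLR-GAN | FID | 20.27 | | Unverified |
| 7 | FM | FID | 14.45 | | Unverified |
| 8 | CT (Direct Generation, NFE=1) | FID | 13 | | Unverified |
| 9 | CT (Direct Generation, NFE=2) | FID | 11.1 | | Unverified |
| 10 | GLIDE +CLS | KID | 7.95 | | Unverified |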
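
For readers unfamiliar with the FID metric reported above, its standard definition is sketched here; lower values are better. The symbols are assumptions introduced for this note and do not appear on the page itself: $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the mean and covariance of Inception-network features computed on real and generated images, respectively.

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
$$

The score is zero when the two feature distributions have identical means and covariances. Reported values also depend on the dataset, sample count, and feature extractor, so scores from different leaderboards are not directly comparable.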