
Image Generation

Image generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

  • Unconditional generation refers to sampling from the data distribution without any conditioning input, i.e. $p(y)$.
  • Conditional image generation (subtask) refers to sampling from the data distribution given a condition such as a label, i.e. $p(y|x)$.
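The difference between the two cases can be sketched on toy data. The example below is only an illustration under strong assumptions: 2-D points stand in for images $y$, an integer class stands in for the label $x$, and a per-class Gaussian stands in for a real generative model such as a GAN or diffusion model. All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "dataset": 2-D points y (stand-ins for images), each with a class label x.
labels = rng.integers(0, 3, size=600)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
data = centers[labels] + rng.normal(scale=0.5, size=(600, 2))


def fit_class_gaussians(data, labels):
    """Estimate the label prior p(x) and one Gaussian p(y|x) per label."""
    classes = np.unique(labels)
    priors = np.array([(labels == c).mean() for c in classes])
    means = np.array([data[labels == c].mean(axis=0) for c in classes])
    covs = np.array([np.cov(data[labels == c], rowvar=False) for c in classes])
    return classes, priors, means, covs


def sample_conditional(model, label, n, rng):
    """Draw n samples from p(y|x=label)."""
    classes, _, means, covs = model
    idx = int(np.where(classes == label)[0][0])
    return rng.multivariate_normal(means[idx], covs[idx], size=n)


def sample_unconditional(model, n, rng):
    """Draw n samples from p(y) by first drawing a label from p(x)."""
    classes, priors, _, _ = model
    drawn = rng.choice(classes, size=n, p=priors)
    return np.vstack([sample_conditional(model, c, 1, rng) for c in drawn])


model = fit_class_gaussians(data, labels)
conditional = sample_conditional(model, label=1, n=200, rng=rng)
unconditional = sample_unconditional(model, n=200, rng=rng)
```

Here the conditional samples cluster around a single class center, while the unconditional samples cover all three clusters in proportion to how often each label occurs in the data.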

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 701–750 of 6689 papers

| Title | Status | Hype |
|---|---|---|
| SafeText: Safe Text-to-image Models via Aligning the Text Encoder | | 0 |
| DiffBrush: Just Painting the Art by Your Hands | | 0 |
| Gungnir: Exploiting Stylistic Features in Images for Backdoor Attacks on Diffusion Models | Code | 0 |
| FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction | Code | 2 |
| Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation | Code | 3 |
| Finding Local Diffusion Schrödinger Bridge using Kolmogorov-Arnold Network | Code | 0 |
| FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute | | 0 |
| Analyzing CLIP's Performance Limitations in Multi-Object Scenarios: A Controlled High-Resolution Study | | 0 |
| Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think | Code | 2 |
| Language-Informed Hyperspectral Image Synthesis for Imbalanced-Small Sample Classification via Semi-Supervised Conditional Diffusion Model | | 0 |
| Attention Distillation: A Unified Approach to Visual Characteristics Transfer | Code | 3 |
| Optimal Stochastic Trace Estimation in Generative Modeling | | 0 |
| Multi-modal Contrastive Learning for Tumor-specific Missing Modality Synthesis | | 0 |
| 3D Nephrographic Image Synthesis in CT Urography with the Diffusion Model and Swin Transformer | | 0 |
| AI-Instruments: Embodying Prompts as Instruments to Abstract & Reflect Graphical Interface Commands as General-Purpose Tools | | 0 |
| ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation | Code | 3 |
| LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation | | 0 |
| FoREST: Frame of Reference Evaluation in Spatial Reasoning Tasks | Code | 0 |
| Training Consistency Models with Variational Noise Coupling | Code | 1 |
| Bayesian Optimization for Controlled Image Editing via LLMs | | 0 |
| Robust Polyp Detection and Diagnosis through Compositional Prompt-Guided Diffusion Models | | 0 |
| ASurvey: Spatiotemporal Consistency in Video Generation | | 0 |
| A Pragmatic Note on Evaluating Generative Models with Fréchet Inception Distance for Retinal Image Synthesis | | 0 |
| SynthRAD2025 Grand Challenge dataset: generating synthetic CTs for radiotherapy | | 0 |
| Autoregressive Image Generation Guided by Chains of Thought | | 0 |
| Distributional Vision-Language Alignment by Cauchy-Schwarz Divergence | | 0 |
| Diffusion Models for Tabular Data: Challenges, Current Progress, and Future Directions | Code | 2 |
| Culture-TRIP: Culturally-Aware Text-to-Image Generation with Iterative Prompt Refinement | | 0 |
| RELICT: A Replica Detection Framework for Medical Image Generation | Code | 0 |
| Fractal Generative Models | Code | 5 |
| DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks | Code | 3 |
| Iterative Flow Matching -- Path Correction and Gradual Refinement for Enhanced Generative Modeling | | 0 |
| Unified Prompt Attack Against Text-to-Image Generation Models | | 0 |
| High-resolution Rainy Image Synthesis: Learning from Rendering | Code | 0 |
| DualNeRF: Text-Driven 3D Scene Editing via Dual-Field Representation | | 0 |
| One-step Diffusion Models with f-Divergence Distribution Matching | | 0 |
| FlipConcept: Tuning-Free Multi-Concept Personalization for Text-to-Image Generation | | 0 |
| Multi-Agent Multimodal Models for Multicultural Text to Image Generation | Code | 0 |
| Lung-DDPM: Semantic Layout-guided Diffusion Models for Thoracic CT Image Synthesis | Code | 1 |
| Generative Modeling of Individual Behavior at Scale | | 0 |
| Improving the Diffusability of Autoencoders | | 0 |
| DC-ControlNet: Decoupling Inter- and Intra-Element Conditions in Image Generation with Diffusion Models | | 0 |
| FlexTok: Resampling Images into 1D Token Sequences of Flexible Length | | 0 |
| MagicGeo: Training-Free Text-Guided Geometric Diagram Generation | | 0 |
| Flow-based generative models as iterative algorithms in probability space | | 0 |
| IP-Composer: Semantic Composition of Visual Concepts | | 0 |
| CHATS: Combining Human-Aligned Optimization and Test-Time Sampling for Text-to-Image Generation | Code | 2 |
| Spherical Dense Text-to-Image Synthesis | | 0 |
| Flow-of-Options: Diversified and Improved LLM Reasoning by Thinking Through Options | Code | 0 |
| Personalized Image Generation with Deep Generative Models: A Decade Survey | Code | 3 |

Benchmark Results

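| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Improved DDPM | FID | 12.3 | | Unverified |
| 2 | ADM | FID | 11.84 | | Unverified |
| 3 | BigGAN-deep | FID | 8.1 | | Unverified |
| 4 | Polarity-BigGAN | FID | 6.82 | | Unverified |
| 5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 | | Unverified |
| 6 | MaskGIT | FID | 6.18 | | Unverified |
| 7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 | | Unverified |
| 8 | CDM | FID | 4.88 | | Unverified |
| 9 | ADM-G | FID | 4.59 | | Unverified |
| 10 | RIN | FID | 4.51 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PresGAN | FID | 52.2 | | Unverified |
| 2 | RESFLOW | FID | 48.29 | | Unverified |
| 3 | Residual Flow | FID | 46.37 | | Unverified |
| 4 | GLF+perceptual loss (ours) | FID | 44.6 | | Unverified |
| 5 | ProdPoly no activation functions | FID | 40.45 | | Unverified |
| 6 | ProdPoly no activation functions | FID | 36.77 | | Unverified |
| 7 | ACGAN | FID | 35.47 | | Unverified |
| 8 | DenseFlow-74-10 | FID | 34.9 | | Unverified |
| 9 | NVAE w/ flow | FID | 32.53 | | Unverified |
| 10 | QSNGAN | FID | 31.97 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GLIDE + CLS | FID | 30.87 | | Unverified |
| 2 | GLIDE + CLIP | FID | 30.46 | | Unverified |
| 3 | GLIDE + CLS-FREE | FID | 29.22 | | Unverified |
| 4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 | | Unverified |
| 5 | PGMGAN | FID | 21.73 | | Unverified |
| 6 | CLR-GAN | FID | 20.27 | | Unverified |
| 7 | FM | FID | 14.45 | | Unverified |
| 8 | CT (Direct Generation, NFE=1) | FID | 13 | | Unverified |
| 9 | CT (Direct Generation, NFE=2) | FID | 11.1 | | Unverified |
| 10 | GLIDE + CLS | KID | 7.95 | | Unverified |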
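
Most entries above report FID (Fréchet Inception Distance), where lower is better. As a rough sketch of what that number measures: FID fits a Gaussian to feature vectors of real images and another to features of generated images, then computes the Fréchet distance $\|\mu_a - \mu_b\|^2 + \mathrm{Tr}(\Sigma_a + \Sigma_b - 2(\Sigma_a \Sigma_b)^{1/2})$ between them. The snippet below implements only that distance with NumPy. It assumes the features have already been extracted (in practice from an Inception network, which is not shown), and the function name `frechet_distance` is hypothetical, not taken from any benchmark's evaluation code.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features), n_features >= 2.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((cov_a cov_b)^{1/2}) equals the sum of square roots of the eigenvalues
    # of cov_a @ cov_b, which are real and non-negative for PSD inputs.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Stand-in features; real FID would use Inception activations instead.
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 4))
fake_feats = real_feats + 3.0  # same spread, shifted mean
score = frechet_distance(real_feats, fake_feats)
```

Because the two stand-in feature sets differ only by a constant shift, the covariance terms cancel and the score reduces to the squared distance between the means.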