SOTAVerified

Image Generation

Image Generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

  • Unconditional generation refers to drawing samples from the data distribution without any conditioning input, i.e. $p(y)$.
  • Conditional image generation (subtask) refers to drawing samples conditioned on a label $x$, i.e. $p(y|x)$.
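
The difference between the two settings can be illustrated with a toy sketch. It assumes a tiny hypothetical dataset of (label, value) pairs in which a scalar stands in for an image, and it fits one Gaussian per label. The names `DATA`, `fit_gaussians`, `sample_conditional`, and `sample_unconditional` are illustrative only and do not come from any model listed on this page. Real unconditional models typically learn $p(y)$ directly without labels; here it is written as a marginal over labels purely to show how it relates to $p(y|x)$.

```python
import random
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical toy dataset: (label x, sample y) pairs; y stands in for an image.
DATA = [("cat", 0.9), ("cat", 1.1), ("cat", 1.0),
        ("dog", 4.8), ("dog", 5.2), ("dog", 5.0)]

def fit_gaussians(data):
    """Fit one Gaussian per label and record how often each label occurs."""
    by_label = defaultdict(list)
    for label, y in data:
        by_label[label].append(y)
    params = {x: (mean(ys), stdev(ys)) for x, ys in by_label.items()}
    priors = {x: len(ys) / len(data) for x, ys in by_label.items()}
    return params, priors

def sample_conditional(params, label, rng):
    """Draw y ~ p(y | x) for a given label x."""
    mu, sigma = params[label]
    return rng.gauss(mu, sigma)

def sample_unconditional(params, priors, rng):
    """Draw y ~ p(y) = sum_x p(x) p(y | x): pick a label, then sample from it."""
    labels = list(priors)
    label = rng.choices(labels, weights=[priors[x] for x in labels])[0]
    return sample_conditional(params, label, rng)

rng = random.Random(0)
params, priors = fit_gaussians(DATA)
cat_samples = [sample_conditional(params, "cat", rng) for _ in range(1000)]
any_samples = [sample_unconditional(params, priors, rng) for _ in range(1000)]
```

Conditional samples stay close to the requested label's mode, while unconditional samples cover every mode in proportion to how often it appears in the data.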

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 51–100 of 6689 papers

Title | Status | Hype
Video Perception Models for 3D Scene Synthesis | - | 0
EAR: Erasing Concepts from Unified Autoregressive Models | Code | 0
Angio-Diff: Learning a Self-Supervised Adversarial Diffusion Model for Angiographic Geometry Generation | Code | 0
SoK: Can Synthetic Images Replace Real Data? A Survey of Utility and Privacy of Synthetic Image Generation | - | 0
OmniGen2: Exploration to Advanced Multimodal Generation | Code | 7
Morse: Dual-Sampling for Lossless Acceleration of Diffusion Models | Code | 1
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation | Code | 3
Programmable-Room: Interactive Textured 3D Room Meshes Generation Empowered by Large Language Models | - | 0
AI's Blind Spots: Geographic Knowledge and Diversity Deficit in Generated Urban Scenario | - | 0
DreamCube: 3D Panorama Generation via Multi-plane Synchronization | - | 0
The Hidden Cost of an Image: Quantifying the Energy Consumption of AI Image Generation | - | 0
Beyond Blur: A Fluid Perspective on Generative Diffusion Models | - | 0
Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens | Code | 3
Category-based Galaxy Image Generation via Diffusion Models | - | 0
PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models | - | 0
Watermarking Autoregressive Image Generation | Code | 2
Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model | Code | 1
DiffusionBlocks: Blockwise Training for Generative Models via Score-Based Diffusion | - | 0
Risk Estimation of Knee Osteoarthritis Progression via Predictive Multi-task Modelling from Efficient Diffusion Model using X-ray Images | - | 0
Cost-Aware Routing for Efficient Text-To-Image Generation | - | 0
Align Your Flow: Scaling Continuous-Time Flow Map Distillation | - | 0
Decoupled Classifier-Free Guidance for Counterfactual Diffusion Models | - | 0
VideoMAR: Autoregressive Video Generation with Continuous Tokens | - | 0
Fair Generation without Unfair Distortions: Debiasing Text-to-Image Generation with Entanglement-Free Attention | - | 0
Quantitative Comparison of Fine-Tuning Techniques for Pretrained Latent Diffusion Models in the Generation of Unseen SAR Image Concepts | - | 0
Deep Diffusion Models and Unsupervised Hyperspectral Unmixing for Realistic Abundance Map Synthesis | - | 0
Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation | - | 0
Exploring the Effectiveness of Deep Features from Domain-Specific Foundation Models in Retinal Image Synthesis | - | 0
A Watermark for Auto-Regressive Image Generation Models | - | 0
Edit360: 2D Image Edits to 3D Assets from Any Angle | - | 0
Pisces: An Auto-regressive Foundation Model for Image Understanding and Generation | - | 0
Anatomy-Grounded Weakly Supervised Prompt Tuning for Chest X-ray Latent Diffusion Models | - | 0
High-resolution efficient image generation from WiFi CSI using a pretrained latent diffusion model | - | 0
Symmetrical Flow Matching: Unified Image Generation, Segmentation, and Classification with Score-Based Generative Models | - | 0
The Role of Generative AI in Facilitating Social Interactions: A Scoping Review | - | 0
Text to Image for Multi-Label Image Recognition with Joint Prompt-Adapter Learning | - | 0
MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning | - | 0
Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression | Code | 2
SAGE: Exploring the Boundaries of Unsafe Concept Domain with Semantic-Augment Erasing | Code | 0
SPARKE: Scalable Prompt-Aware Diversity Guidance in Diffusion Models via RKE Score | - | 0
Prompt-Guided Latent Diffusion with Predictive Class Conditioning for 3D Prostate MRI Generation | - | 0
Only-Style: Stylistic Consistency in Image Generation without Content Leakage | - | 0
HadaNorm: Diffusion Transformer Quantization through Mean-Centered Transformations | - | 0
Ming-Omni: A Unified Multimodal Model for Perception and Generation | Code | 4
ELBO-T2IAlign: A Generic ELBO-Based Method for Calibrating Pixel-level Text-Image Alignment in Diffusion Models | - | 0
Consistent Story Generation with Asymmetry Zigzag Sampling | Code | 0
Geometric Regularity in Deterministic Sampling of Diffusion-based Generative Models | - | 0
Noise Conditional Variational Score Distillation | Code | 1
Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better | Code | 2
Diffuse and Disperse: Image Generation with Representation Regularization | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Improved DDPM | FID | 12.3 | - | Unverified
2 | ADM | FID | 11.84 | - | Unverified
3 | BigGAN-deep | FID | 8.1 | - | Unverified
4 | Polarity-BigGAN | FID | 6.82 | - | Unverified
5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 | - | Unverified
6 | MaskGIT | FID | 6.18 | - | Unverified
7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 | - | Unverified
8 | CDM | FID | 4.88 | - | Unverified
9 | ADM-G | FID | 4.59 | - | Unverified
10 | RIN | FID | 4.51 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PresGAN | FID | 52.2 | - | Unverified
2 | RESFLOW | FID | 48.29 | - | Unverified
3 | Residual Flow | FID | 46.37 | - | Unverified
4 | GLF+perceptual loss (ours) | FID | 44.6 | - | Unverified
5 | ProdPoly no activation functions | FID | 40.45 | - | Unverified
6 | ProdPoly no activation functions | FID | 36.77 | - | Unverified
7 | ACGAN | FID | 35.47 | - | Unverified
8 | DenseFlow-74-10 | FID | 34.9 | - | Unverified
9 | NVAE w/ flow | FID | 32.53 | - | Unverified
10 | QSNGAN | FID | 31.97 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GLIDE + CLS | FID | 30.87 | - | Unverified
2 | GLIDE + CLIP | FID | 30.46 | - | Unverified
3 | GLIDE + CLS-FREE | FID | 29.22 | - | Unverified
4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 | - | Unverified
5 | PGMGAN | FID | 21.73 | - | Unverified
6 | CLR-GAN | FID | 20.27 | - | Unverified
7 | FM | FID | 14.45 | - | Unverified
8 | CT (Direct Generation, NFE=1) | FID | 13 | - | Unverified
9 | CT (Direct Generation, NFE=2) | FID | 11.1 | - | Unverified
10 | GLIDE + CLS | KID | 7.95 | - | Unverified
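
Nearly all rows above report FID (Fréchet Inception Distance), where lower is better. For reference, the standard definition is given below. The symbols are introduced here and do not appear on the original page: $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the mean and covariance of Inception-network features computed on real and generated images, respectively. Reported values also depend on the sample count and the feature extractor used, so scores from different papers may not be directly comparable.

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
$$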