SOTAVerified

Image Generation

Image Generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

  • Unconditional generation draws samples from the data distribution without any conditioning input, i.e. from $p(y)$.
  • Conditional image generation (subtask) draws samples conditioned on a label $x$, i.e. from $p(y|x)$.
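
The difference between $p(y)$ and $p(y|x)$ can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "images" are single brightness values, the labels `dark` and `bright` are hypothetical, and a fitted Gaussian stands in for a learned generative model.

```python
import random
import statistics

# Toy dataset of (label x, sample y) pairs; each y is one brightness
# value standing in for a whole image (illustrative assumption).
DATA = [
    ("dark", 0.10), ("dark", 0.15), ("dark", 0.20),
    ("bright", 0.80), ("bright", 0.85), ("bright", 0.90),
]


def fit_gaussian(values):
    """Return the mean and population standard deviation of the values."""
    return statistics.mean(values), statistics.pstdev(values)


def sample_unconditional(rng):
    """Draw y ~ p(y): fit on every sample and ignore the labels."""
    mu, sigma = fit_gaussian([y for _, y in DATA])
    return rng.gauss(mu, sigma)


def sample_conditional(rng, x):
    """Draw y ~ p(y | x): fit only on the samples whose label equals x."""
    values = [y for label, y in DATA if label == x]
    if not values:
        raise ValueError(f"no samples with label {x!r}")
    mu, sigma = fit_gaussian(values)
    return rng.gauss(mu, sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    print("unconditional:", sample_unconditional(rng))
    print("conditional on 'bright':", sample_conditional(rng, "bright"))
```

Unconditional draws spread over the whole brightness range, while draws conditioned on `bright` cluster around the values seen for that label.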

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 2601-2650 of 6689 papers

| Title | Status | Hype |
| --- | --- | --- |
| AI for Just Work: Constructing Diverse Imaginations of AI beyond "Replacing Humans" | | 0 |
| Synthetic Lung X-ray Generation through Cross-Attention and Affinity Transformation | | 0 |
| Color Alignment in Diffusion | | 0 |
| DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability | | 0 |
| ARMOR v0.1: Empowering Autoregressive Multimodal Understanding Model with Interleaved Multimodal Generation via Asymmetric Synergy | | 0 |
| Towards More Accurate Personalized Image Generation: Addressing Overfitting and Evaluation Bias | Code | 0 |
| Adding Additional Control to One-Step Diffusion with Joint Distribution Matching | | 0 |
| Generative modelling with jump-diffusions | Code | 0 |
| PixelPonder: Dynamic Patch Adaptation for Enhanced Multi-Conditional Text-to-Image Generation | | 0 |
| TR-DQ: Time-Rotation Diffusion Quantization | | 0 |
| Fine-Grained Alignment and Noise Refinement for Compositional Text-to-Image Generation | Code | 0 |
| Consistent Image Layout Editing with Diffusion Models | | 0 |
| Disrupting Model Merging: A Parameter-Level Defense Without Sacrificing Accuracy | | 0 |
| VLForgery Face Triad: Detection, Localization and Attribution via Multimodal Large Language Models | | 0 |
| Text2Story: Advancing Video Storytelling with Text Guidance | | 0 |
| Anti-Diffusion: Preventing Abuse of Modifications of Diffusion-Based Models | Code | 0 |
| Frequency Autoregressive Image Generation with Continuous Tokens | | 0 |
| LapLoss: Laplacian Pyramid-based Multiscale loss for Image Translation | | 0 |
| Synthetic Data is an Elegant GIFT for Continual Vision-Language Models | | 0 |
| Energy-Guided Optimization for Personalized Image Editing with Pretrained Text-to-Image Diffusion Models | | 0 |
| Find Matching Faces Based On Face Parameters | | 0 |
| Multi-View Depth Consistent Image Generation Using Generative AI Models: Application on Architectural Design of University Buildings | | 0 |
| A Generative Approach to High Fidelity 3D Reconstruction from Text Data | | 0 |
| Straight-Line Diffusion Model for Efficient 3D Molecular Generation | Code | 0 |
| Generative Modeling of Microweather Wind Velocities for Urban Air Mobility | Code | 0 |
| Q&C: When Quantization Meets Cache in Efficient Image Generation | Code | 0 |
| RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification | | 0 |
| Teaching Metric Distance to Autoregressive Multimodal Foundational Models | | 0 |
| Robust time series generation via Schrödinger Bridge: a comprehensive evaluation | | 0 |
| ACCORD: Alleviating Concept Coupling through Dependence Regularization for Text-to-Image Diffusion Personalization | | 0 |
| TactStyle: Generating Tactile Textures with Generative AI for Digital Fabrication | | 0 |
| DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models | | 0 |
| HanDrawer: Leveraging Spatial Information to Render Realistic Hands Using a Conditional Diffusion Model in Single Stage | | 0 |
| FRMD: Fast Robot Motion Diffusion with Consistency-Distilled Movement Primitives for Smooth Action Generation | | 0 |
| Enhancing Retinal Vessel Segmentation Generalization via Layout-Aware Generative Modelling | | 0 |
| Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text | | 0 |
| MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations | | 0 |
| Cross Modality Medical Image Synthesis for Improving Liver Segmentation | | 0 |
| Evaluating and Predicting Distorted Human Body Parts for Generated Images | Code | 0 |
| Development of an Unpaired Deep Neural Network for Synthesizing X-ray Fluoroscopic Images from Digitally Reconstructed Tomography in Image Guided Radiotherapy | | 0 |
| DiffBrush: Just Painting the Art by Your Hands | | 0 |
| SafeText: Safe Text-to-image Models via Aligning the Text Encoder | | 0 |
| How far can we go with ImageNet for Text-to-Image generation? | | 0 |
| Gungnir: Exploiting Stylistic Features in Images for Backdoor Attacks on Diffusion Models | Code | 0 |
| Analyzing CLIP's Performance Limitations in Multi-Object Scenarios: A Controlled High-Resolution Study | | 0 |
| FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute | | 0 |
| Finding Local Diffusion Schrödinger Bridge using Kolmogorov-Arnold Network | Code | 0 |
| Language-Informed Hyperspectral Image Synthesis for Imbalanced-Small Sample Classification via Semi-Supervised Conditional Diffusion Model | | 0 |
| Multi-modal Contrastive Learning for Tumor-specific Missing Modality Synthesis | | 0 |
| Optimal Stochastic Trace Estimation in Generative Modeling | | 0 |
Page 53 of 134

Benchmark Results

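| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Improved DDPM | FID | 12.3 | | Unverified |
| 2 | ADM | FID | 11.84 | | Unverified |
| 3 | BigGAN-deep | FID | 8.1 | | Unverified |
| 4 | Polarity-BigGAN | FID | 6.82 | | Unverified |
| 5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 | | Unverified |
| 6 | MaskGIT | FID | 6.18 | | Unverified |
| 7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 | | Unverified |
| 8 | CDM | FID | 4.88 | | Unverified |
| 9 | ADM-G | FID | 4.59 | | Unverified |
| 10 | RIN | FID | 4.51 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PresGAN | FID | 52.2 | | Unverified |
| 2 | RESFLOW | FID | 48.29 | | Unverified |
| 3 | Residual Flow | FID | 46.37 | | Unverified |
| 4 | GLF+perceptual loss (ours) | FID | 44.6 | | Unverified |
| 5 | ProdPoly no activation functions | FID | 40.45 | | Unverified |
| 6 | ProdPoly no activation functions | FID | 36.77 | | Unverified |
| 7 | ACGAN | FID | 35.47 | | Unverified |
| 8 | DenseFlow-74-10 | FID | 34.9 | | Unverified |
| 9 | NVAE w/ flow | FID | 32.53 | | Unverified |
| 10 | QSNGAN | FID | 31.97 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GLIDE + CLS | FID | 30.87 | | Unverified |
| 2 | GLIDE + CLIP | FID | 30.46 | | Unverified |
| 3 | GLIDE + CLS-FREE | FID | 29.22 | | Unverified |
| 4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 | | Unverified |
| 5 | PGMGAN | FID | 21.73 | | Unverified |
| 6 | CLR-GAN | FID | 20.27 | | Unverified |
| 7 | FM | FID | 14.45 | | Unverified |
| 8 | CT (Direct Generation, NFE=1) | FID | 13 | | Unverified |
| 9 | CT (Direct Generation, NFE=2) | FID | 11.1 | | Unverified |
| 10 | GLIDE +CLS | KID | 7.95 | | Unverified |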
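
Most entries in these benchmark results report FID (Fréchet Inception Distance), where lower is better. FID fits a Gaussian to feature vectors of real images and another to those of generated images, then computes $\|\mu_r-\mu_g\|^2 + \mathrm{Tr}(\Sigma_r+\Sigma_g-2(\Sigma_r\Sigma_g)^{1/2})$. The sketch below is a simplified illustration, not the procedure behind any listed number: it assumes the features are already extracted (standard FID takes them from a pretrained Inception network) and treats the covariances as diagonal, so the trace term reduces to a per-dimension sum. The function names are hypothetical.

```python
import math


def mean_and_variance(features):
    """Per-dimension mean and population variance of equal-length feature vectors."""
    n = len(features)
    dims = len(features[0])
    means = [sum(f[d] for f in features) / n for d in range(dims)]
    variances = [
        sum((f[d] - means[d]) ** 2 for f in features) / n for d in range(dims)
    ]
    return means, variances


def frechet_distance_diagonal(real_features, generated_features):
    """Frechet distance between two Gaussians with diagonal covariances.

    Simplifying assumption: off-diagonal covariance entries are ignored,
    so Tr(S_r + S_g - 2 (S_r S_g)^(1/2)) becomes
    sum_d (v_r[d] + v_g[d] - 2 * sqrt(v_r[d] * v_g[d])).
    """
    mu_r, var_r = mean_and_variance(real_features)
    mu_g, var_g = mean_and_variance(generated_features)
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_g))
    cov_term = sum(a + b - 2.0 * math.sqrt(a * b) for a, b in zip(var_r, var_g))
    return mean_term + cov_term


if __name__ == "__main__":
    real = [[0.0, 0.0], [2.0, 2.0]]
    generated = [[1.0, 1.0], [3.0, 3.0]]
    print(frechet_distance_diagonal(real, generated))
```

A full implementation would use the complete covariance matrices and a matrix square root, which this sketch deliberately avoids to stay dependency-free.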