SOTAVerified

Image Generation

Image generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

  • Unconditional generation draws samples from the data distribution without any conditioning input, i.e. $p(y)$.
  • Conditional image generation (subtask) draws samples from the data distribution given a label $x$, i.e. $p(y|x)$.
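
The distinction between $p(y)$ and $p(y|x)$ can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the dataset, the label names, and the helpers `sample_unconditional` and `sample_conditional` are hypothetical, and resampling stored items merely stands in for a learned generative model, which would synthesize new images rather than return existing ones.

```python
import random

# Toy "dataset" of (label x, sample y) pairs. Strings stand in for images.
DATASET = [
    ("cat", "cat_img_0"),
    ("cat", "cat_img_1"),
    ("dog", "dog_img_0"),
    ("dog", "dog_img_1"),
    ("dog", "dog_img_2"),
]


def sample_unconditional(dataset, rng):
    """Draw y ~ p(y): the label is ignored entirely."""
    _, y = rng.choice(dataset)
    return y


def sample_conditional(dataset, x, rng):
    """Draw y ~ p(y|x): only samples whose label equals x are eligible."""
    candidates = [y for label, y in dataset if label == x]
    if not candidates:
        raise ValueError(f"no samples with label {x!r}")
    return rng.choice(candidates)


rng = random.Random(0)
print(sample_unconditional(DATASET, rng))       # any of the five items
print(sample_conditional(DATASET, "cat", rng))  # always a cat item
```

In the unconditional case every item is eligible; in the conditional case the label restricts the set of eligible outputs.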

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.


Papers

Showing 2601–2650 of 6689 papers

Title | Status | Hype
Synth^2: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings | - | 0
Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model | Code | 3
Block-wise LoRA: Revisiting Fine-grained LoRA for Effective Personalization and Stylization in Text-to-Image Generation | - | 0
Premonition: Using Generative Models to Preempt Future Data Changes in Continual Learning | Code | 0
Optimizing Negative Prompts for Enhanced Aesthetics and Fidelity in Text-To-Image Generation | - | 0
SSM Meets Video Diffusion Models: Efficient Long-Term Video Generation with Structured State Spaces | Code | 1
FFAD: A Novel Metric for Assessing Generated Time Series Data Utilizing Fourier Transform and Auto-encoder | - | 0
Medical Image Synthesis via Fine-Grained Image-Text Alignment and Anatomy-Pathology Prompting | - | 0
DivCon: Divide and Conquer for Progressive Text-to-Image Generation | - | 0
Distribution-Aware Data Expansion with Diffusion Models | Code | 1
Advancing Text-Driven Chest X-Ray Generation with Policy-Based Reinforcement Learning | - | 0
Active Generation for Image Classification | Code | 0
FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation | - | 0
3D-aware Image Generation and Editing with Multi-modal Conditions | - | 0
Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models | Code | 1
VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models | Code | 2
V_kD: Improving Knowledge Distillation using Orthogonal Projections | Code | 2
Text-Guided Variational Image Generation for Industrial Anomaly Detection and Segmentation | - | 0
FastVideoEdit: Leveraging Consistency Models for Efficient Text-to-Video Editing | - | 0
PEPSI: Pathology-Enhanced Pulse-Sequence-Invariant Representations for Brain MRI | Code | 1
Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines | - | 0
Adaptive Multi-modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution | - | 0
Privacy-Preserving Diffusion Model Using Homomorphic Encryption | Code | 1
CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion | Code | 5
DiffChat: Learning to Chat with Text-to-Image Synthesis Models for Interactive Image Creation | - | 0
Denoising Autoregressive Representation Learning | - | 0
A Data Augmentation Pipeline to Generate Synthetic Labeled Datasets of 3D Echocardiography Images using a GAN | - | 0
Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapolation | - | 0
Improving Diffusion-Based Generative Models via Approximated Optimal Transport | Code | 0
Fine-tuning a Multiple Instance Learning Feature Extractor with Masked Context Modelling and Knowledge Distillation | - | 0
Spectrum Translation for Refinement of Image Generation (STIG) Based on Contrastive Learning and Spectral Filter Profile | Code | 0
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment | Code | 5
Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation | - | 0
Synthetic Privileged Information Enhances Medical Image Representation Learning | - | 0
StereoDiffusion: Training-Free Stereo Image Generation Using Latent Diffusion Models | Code | 2
Evaluating Text-to-Image Generative Models: An Empirical Study on Human Image Synthesis | Code | 0
Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation | - | 0
Discriminative Probing and Tuning for Text-to-Image Generation | - | 0
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation | Code | 5
A spatiotemporal style transfer algorithm for dynamic visual stimulus generation | - | 0
Measuring Diversity in Co-creative Image Generation | - | 0
Investigation of the Impact of Synthetic Training Data in the Industrial Application of Terminal Strip Object Detection | - | 0
PromptCharm: Text-to-Image Generation through Multi-modal Prompting and Refinement | Code | 1
Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer | - | 0
Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing | - | 0
FLAME Diffuser: Wildfire Image Synthesis using Mask Guided Diffusion | Code | 1
NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging | Code | 2
ENOT: Expectile Regularization for Fast and Accurate Training of Neural Optimal Transport | - | 0
(Un)paired signal-to-signal translation with 1D conditional GANs | - | 0
Behavior Generation with Latent Actions | Code | 3

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Improved DDPM | FID | 12.3 | - | Unverified
2 | ADM | FID | 11.84 | - | Unverified
3 | BigGAN-deep | FID | 8.1 | - | Unverified
4 | Polarity-BigGAN | FID | 6.82 | - | Unverified
5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 | - | Unverified
6 | MaskGIT | FID | 6.18 | - | Unverified
7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 | - | Unverified
8 | CDM | FID | 4.88 | - | Unverified
9 | ADM-G | FID | 4.59 | - | Unverified
10 | RIN | FID | 4.51 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PresGAN | FID | 52.2 | - | Unverified
2 | RESFLOW | FID | 48.29 | - | Unverified
3 | Residual Flow | FID | 46.37 | - | Unverified
4 | GLF+perceptual loss (ours) | FID | 44.6 | - | Unverified
5 | ProdPoly no activation functions | FID | 40.45 | - | Unverified
6 | ProdPoly no activation functions | FID | 36.77 | - | Unverified
7 | ACGAN | FID | 35.47 | - | Unverified
8 | DenseFlow-74-10 | FID | 34.9 | - | Unverified
9 | NVAE w/ flow | FID | 32.53 | - | Unverified
10 | QSNGAN | FID | 31.97 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GLIDE + CLS | FID | 30.87 | - | Unverified
2 | GLIDE + CLIP | FID | 30.46 | - | Unverified
3 | GLIDE + CLS-FREE | FID | 29.22 | - | Unverified
4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 | - | Unverified
5 | PGMGAN | FID | 21.73 | - | Unverified
6 | CLR-GAN | FID | 20.27 | - | Unverified
7 | FM | FID | 14.45 | - | Unverified
8 | CT (Direct Generation, NFE=1) | FID | 13 | - | Unverified
9 | CT (Direct Generation, NFE=2) | FID | 11.1 | - | Unverified
10 | GLIDE + CLS | KID | 7.95 | - | Unverified
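
Most entries above report FID (Fréchet Inception Distance), where lower values indicate generated images whose feature statistics are closer to those of real images. As a rough sketch of the underlying computation, the Fréchet distance between two Gaussians fitted to feature vectors is $\|\mu_a - \mu_b\|^2 + \mathrm{Tr}(\Sigma_a + \Sigma_b - 2(\Sigma_a \Sigma_b)^{1/2})$. The helper `frechet_distance` below is a hypothetical name, and the random arrays are stand-ins for the Inception network features a real FID evaluation would use; this is not the evaluation code behind any number in the tables.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two (n, d) feature arrays."""
    mu_a = feats_a.mean(axis=0)
    mu_b = feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((cov_a cov_b)^{1/2}) is the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b. For positive semi-definite covariances
    # these eigenvalues are real and non-negative in exact arithmetic, so
    # tiny imaginary or negative parts from round-off are discarded.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Stand-in features (assumption): 500 samples with 4 dimensions each.
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 4))
fake_feats = real_feats + 3.0  # same covariance, mean shifted by 3 per dimension

print(frechet_distance(real_feats, real_feats))
print(frechet_distance(real_feats, fake_feats))
```

Identical feature sets give a distance near zero. Shifting every feature by 3 leaves the covariance unchanged, so the second value is close to the squared mean shift, 4 × 3² = 36.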