SOTAVerified

Image Generation

Image Generation (synthesis) is the task of generating new images from an existing dataset.

  • Unconditional generation draws samples from the data distribution with no conditioning input, i.e. $p(y)$.
  • Conditional image generation (subtask) draws samples conditioned on a label $x$, i.e. $p(y|x)$.
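
The difference between $p(y)$ and $p(y|x)$ can be illustrated with a toy sketch. The dataset, labels, and function names below are hypothetical and only resample an empirical distribution; a real generative model would learn to approximate these distributions instead of drawing from the training set.

```python
import random

# Hypothetical toy dataset: (label x, sample y) pairs standing in for labelled images.
DATASET = [
    ("cat", "cat_img_0"),
    ("cat", "cat_img_1"),
    ("dog", "dog_img_0"),
    ("dog", "dog_img_1"),
    ("dog", "dog_img_2"),
]


def sample_unconditional(rng):
    """Draw y ~ p(y): the label is ignored entirely."""
    _, y = rng.choice(DATASET)
    return y


def sample_conditional(label, rng):
    """Draw y ~ p(y | x = label): only samples carrying that label are eligible."""
    candidates = [y for x, y in DATASET if x == label]
    if not candidates:
        raise ValueError(f"no samples with label {label!r}")
    return rng.choice(candidates)


rng = random.Random(0)
print(sample_unconditional(rng))       # any of the five samples
print(sample_conditional("cat", rng))  # always one of the two cat samples
```

In this sketch the conditional sampler can only return images of the requested class, while the unconditional sampler follows the class mix of the dataset as a whole.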

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 3601-3650 of 6689 papers

Title | Status | Hype
Triplet-Aware Scene Graph Embeddings | | 0
MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation | | 0
MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices | | 0
Mobile Fitting Room: On-device Virtual Try-on via Diffusion Models | | 0
MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices | | 0
Triplet Synthesis For Enhancing Composed Image Retrieval via Counterfactual Image Generation | | 0
Mod-Adapter: Tuning-Free and Versatile Multi-concept Personalization via Modulation Adapter | | 0
Model-Agnostic Human Preference Inversion in Diffusion Models | | 0
Model alignment using inter-modal bridges | | 0
Model as a Game: On Numerical and Spatial Consistency for Generative Games | | 0
Model-Based Image Signal Processors via Learnable Dictionaries | | 0
Model Collapse Demystified: The Case of Regression | | 0
Model Discrepancy Learning: Synthetic Faces Detection Based on Multi-Reconstruction | | 0
Model-free Optical Processors using In Situ Reinforcement Learning with Proximal Policy Optimization | | 0
Trojan Horse Prompting: Jailbreaking Conversational Multimodal Models by Forging Assistant Message | | 0
cGANs with Conditional Convolution Layer | | 0
Modeling the Nonsmoothness of Modern Neural Networks | | 0
Approximate Caching for Efficiently Serving Diffusion Models | | 0
Moderating the Generalization of Score-based Generative Model | | 0
Modular Conversational Agents for Surveys and Interviews | | 0
A Modular Deep Learning Pipeline for Galaxy-Scale Strong Gravitational Lens Detection and Modeling | | 0
TruePose: Human-Parsing-guided Attention Diffusion for Full-ID Preserving Pose Transfer | | 0
Modulating human brain responses via optimal natural image selection and synthetic image generation | | 0
Modulating Pretrained Diffusion Models for Multimodal Image Synthesis | | 0
MoEController: Instruction-based Arbitrary Image Manipulation with Mixture-of-Expert Controllers | | 0
Truncated Consistency Models | | 0
Zero-shot detection of buildings in mobile LiDAR using Language Vision Model | | 0
Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation | | 0
Mol2Image: Improved Conditional Flow Models for Molecule to Image Synthesis | | 0
MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts | | 0
Applications and Effect Evaluation of Generative Adversarial Networks in Semi-Supervised Learning | | 0
Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization | | 0
Monocular Depth Estimation using Diffusion Models | | 0
Application of Unsupervised Domain Adaptation for Structural MRI Analysis | | 0
Appearance Matching Adapter for Exemplar-based Semantic Image Synthesis | | 0
More Control for Free! Image Synthesis with Semantic Diffusion Guidance | | 0
Morphological-consistent Diffusion Network for Ultrasound Coronal Image Enhancement | | 0
MosaiQ: Quantum Generative Adversarial Networks for Image Generation on NISQ Computers | | 0
MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation | | 0
Trustworthy SR: Resolving Ambiguity in Image Super-resolution via Diffusion Models and Human Feedback | | 0
Appearance Harmonization for Single Image Shadow Removal | | 0
MouseGAN++: Unsupervised Disentanglement and Contrastive Representation for Multiple MRI Modalities Synthesis and Structural Segmentation of Mouse Brain | | 0
MoVideo: Motion-Aware Video Generation with Diffusion Models | | 0
MPDS: A Movie Posters Dataset for Image Generation with Diffusion Model | | 0
What's Next? Exploring Utilization, Challenges, and Future Directions of AI-Generated Image Tools in Graphic Design | | 0
MRI Image Generation Based on Text Prompts | | 0
MRIS: A Multi-modal Retrieval Approach for Image Synthesis on Diverse Modalities | | 0
MR to X-Ray Projection Image Synthesis | | 0
MS^3D: A RG Flow-Based Regularization for GAN Training with Limited Data | | 0
MsCGAN: Multi-scale Conditional Generative Adversarial Networks for Person Image Generation | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Improved DDPM | FID | 12.3 | | Unverified
2 | ADM | FID | 11.84 | | Unverified
3 | BigGAN-deep | FID | 8.1 | | Unverified
4 | Polarity-BigGAN | FID | 6.82 | | Unverified
5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 | | Unverified
6 | MaskGIT | FID | 6.18 | | Unverified
7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 | | Unverified
8 | CDM | FID | 4.88 | | Unverified
9 | ADM-G | FID | 4.59 | | Unverified
10 | RIN | FID | 4.51 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PresGAN | FID | 52.2 | | Unverified
2 | RESFLOW | FID | 48.29 | | Unverified
3 | Residual Flow | FID | 46.37 | | Unverified
4 | GLF+perceptual loss (ours) | FID | 44.6 | | Unverified
5 | ProdPoly no activation functions | FID | 40.45 | | Unverified
6 | ProdPoly no activation functions | FID | 36.77 | | Unverified
7 | ACGAN | FID | 35.47 | | Unverified
8 | DenseFlow-74-10 | FID | 34.9 | | Unverified
9 | NVAE w/ flow | FID | 32.53 | | Unverified
10 | QSNGAN | FID | 31.97 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GLIDE + CLS | FID | 30.87 | | Unverified
2 | GLIDE + CLIP | FID | 30.46 | | Unverified
3 | GLIDE + CLS-FREE | FID | 29.22 | | Unverified
4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 | | Unverified
5 | PGMGAN | FID | 21.73 | | Unverified
6 | CLR-GAN | FID | 20.27 | | Unverified
7 | FM | FID | 14.45 | | Unverified
8 | CT (Direct Generation, NFE=1) | FID | 13 | | Unverified
9 | CT (Direct Generation, NFE=2) | FID | 11.1 | | Unverified
10 | GLIDE +CLS | KID | 7.95 | | Unverified
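
Most entries above report FID (Fréchet Inception Distance), where lower is better. As a rough illustration of what the number measures, the sketch below computes the Fréchet distance between two Gaussians fitted to one-dimensional feature values. This is a deliberately simplified, hypothetical example: actual FID fits multivariate Gaussians to Inception-v3 features and needs a matrix square root of the covariance product, which this 1-D version reduces to a product of standard deviations. The function name and sample values are assumptions for illustration only.

```python
import statistics


def frechet_distance_1d(real, generated):
    """Squared Fréchet distance between 1-D Gaussians fitted to two samples.

    In one dimension the general formula
        |mu_r - mu_g|^2 + Tr(S_r + S_g - 2 (S_r S_g)^(1/2))
    reduces to (mu_r - mu_g)^2 + (sd_r - sd_g)^2.
    """
    mu_r = statistics.fmean(real)
    mu_g = statistics.fmean(generated)
    # Population standard deviation, chosen here for simplicity (an assumption).
    sd_r = statistics.pstdev(real)
    sd_g = statistics.pstdev(generated)
    return (mu_r - mu_g) ** 2 + (sd_r - sd_g) ** 2


# Hypothetical feature values standing in for real and generated images.
real_features = [0.9, 1.1, 1.0, 1.2, 0.8]
close_features = [1.0, 1.1, 0.9, 1.2, 0.9]
far_features = [2.0, 2.4, 1.6, 2.2, 1.8]

print(frechet_distance_1d(real_features, close_features))
print(frechet_distance_1d(real_features, far_features))
```

A generator whose feature statistics match the real data more closely gets a smaller distance, which is why the leaderboard treats lower FID as better.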