SOTAVerified

Image Generation

Image generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

  • Unconditional generation refers to sampling images without any conditioning input, i.e. from $p(y)$, where $y$ is an image.
  • Conditional image generation (subtask) refers to sampling images conditioned on a label $x$, i.e. from $p(y|x)$.
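
The difference between sampling from $p(y)$ and from $p(y|x)$ can be illustrated with a toy sketch. This is not a generative model: it only resamples from a small, made-up dataset, and the names `DATASET`, `sample_unconditional`, and `sample_conditional` are hypothetical, chosen for illustration. A real model would learn the distribution and produce new images rather than return stored ones.

```python
import random

# Hypothetical toy dataset of (label x, image y) pairs.
# Each "image" is a stand-in: a short tuple of pixel intensities.
DATASET = [
    ("cat", (0.9, 0.1, 0.1)),
    ("cat", (0.8, 0.2, 0.1)),
    ("dog", (0.1, 0.9, 0.2)),
    ("dog", (0.2, 0.8, 0.1)),
    ("dog", (0.1, 0.7, 0.3)),
]


def sample_unconditional(rng):
    """Draw y ~ p(y): the label is ignored entirely."""
    _, image = rng.choice(DATASET)
    return image


def sample_conditional(rng, label):
    """Draw y ~ p(y | x = label): only samples with that label are eligible."""
    candidates = [image for x, image in DATASET if x == label]
    if not candidates:
        raise ValueError(f"no samples with label {label!r}")
    return rng.choice(candidates)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
print("unconditional:", sample_unconditional(rng))
print("conditional on 'cat':", sample_conditional(rng, "cat"))
```

In this sketch the conditional sampler can only ever return a "cat" image when asked for one, while the unconditional sampler may return an image of either class.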

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 3601–3625 of 6689 papers

Title | Status | Hype
Triplet-Aware Scene Graph Embeddings |  | 0
MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation |  | 0
MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices |  | 0
Mobile Fitting Room: On-device Virtual Try-on via Diffusion Models |  | 0
MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices |  | 0
Triplet Synthesis For Enhancing Composed Image Retrieval via Counterfactual Image Generation |  | 0
Mod-Adapter: Tuning-Free and Versatile Multi-concept Personalization via Modulation Adapter |  | 0
Model-Agnostic Human Preference Inversion in Diffusion Models |  | 0
Model alignment using inter-modal bridges |  | 0
Model as a Game: On Numerical and Spatial Consistency for Generative Games |  | 0
Model-Based Image Signal Processors via Learnable Dictionaries |  | 0
Model Collapse Demystified: The Case of Regression |  | 0
Model Discrepancy Learning: Synthetic Faces Detection Based on Multi-Reconstruction |  | 0
Model-free Optical Processors using In Situ Reinforcement Learning with Proximal Policy Optimization |  | 0
Trojan Horse Prompting: Jailbreaking Conversational Multimodal Models by Forging Assistant Message |  | 0
cGANs with Conditional Convolution Layer |  | 0
Modeling the Nonsmoothness of Modern Neural Networks |  | 0
Approximate Caching for Efficiently Serving Diffusion Models |  | 0
Moderating the Generalization of Score-based Generative Model |  | 0
Modular Conversational Agents for Surveys and Interviews |  | 0
A Modular Deep Learning Pipeline for Galaxy-Scale Strong Gravitational Lens Detection and Modeling |  | 0
TruePose: Human-Parsing-guided Attention Diffusion for Full-ID Preserving Pose Transfer |  | 0
Modulating human brain responses via optimal natural image selection and synthetic image generation |  | 0
Modulating Pretrained Diffusion Models for Multimodal Image Synthesis |  | 0
MoEController: Instruction-based Arbitrary Image Manipulation with Mixture-of-Expert Controllers |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Improved DDPM | FID | 12.3 |  | Unverified
2 | ADM | FID | 11.84 |  | Unverified
3 | BigGAN-deep | FID | 8.1 |  | Unverified
4 | Polarity-BigGAN | FID | 6.82 |  | Unverified
5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 |  | Unverified
6 | MaskGIT | FID | 6.18 |  | Unverified
7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 |  | Unverified
8 | CDM | FID | 4.88 |  | Unverified
9 | ADM-G | FID | 4.59 |  | Unverified
10 | RIN | FID | 4.51 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PresGAN | FID | 52.2 |  | Unverified
2 | RESFLOW | FID | 48.29 |  | Unverified
3 | Residual Flow | FID | 46.37 |  | Unverified
4 | GLF+perceptual loss (ours) | FID | 44.6 |  | Unverified
5 | ProdPoly no activation functions | FID | 40.45 |  | Unverified
6 | ProdPoly no activation functions | FID | 36.77 |  | Unverified
7 | ACGAN | FID | 35.47 |  | Unverified
8 | DenseFlow-74-10 | FID | 34.9 |  | Unverified
9 | NVAE w/ flow | FID | 32.53 |  | Unverified
10 | QSNGAN | FID | 31.97 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GLIDE + CLS | FID | 30.87 |  | Unverified
2 | GLIDE + CLIP | FID | 30.46 |  | Unverified
3 | GLIDE + CLS-FREE | FID | 29.22 |  | Unverified
4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 |  | Unverified
5 | PGMGAN | FID | 21.73 |  | Unverified
6 | CLR-GAN | FID | 20.27 |  | Unverified
7 | FM | FID | 14.45 |  | Unverified
8 | CT (Direct Generation, NFE=1) | FID | 13 |  | Unverified
9 | CT (Direct Generation, NFE=2) | FID | 11.1 |  | Unverified
10 | GLIDE +CLS | KID | 7.95 |  | Unverified