SOTAVerified

Image Generation

Image Generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

  • Unconditional generation refers to sampling images from the dataset distribution without any conditioning input, i.e. from $p(y)$.
  • Conditional image generation (subtask) refers to sampling images conditioned on a label $x$, i.e. from $p(y|x)$.
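
The difference between the two settings can be illustrated with a minimal sketch. It assumes a hypothetical toy dataset in which strings stand in for images, and it resamples existing items from the empirical distribution instead of synthesizing new ones, so it shows only what $p(y)$ and $p(y|x)$ condition on, not how a real generative model works. The names `DATASET`, `sample_unconditional`, and `sample_conditional` are illustrative and do not come from any library.

```python
import random

# Hypothetical toy dataset of (label x, image y) pairs.
# Strings stand in for images; a real model would learn a distribution instead.
DATASET = [
    ("cat", "cat_img_0"),
    ("cat", "cat_img_1"),
    ("dog", "dog_img_0"),
    ("dog", "dog_img_1"),
    ("dog", "dog_img_2"),
]


def sample_unconditional(rng):
    """Draw y ~ p(y): ignore the labels and sample from the whole dataset."""
    _, y = rng.choice(DATASET)
    return y


def sample_conditional(label, rng):
    """Draw y ~ p(y | x = label): sample only among images with that label."""
    candidates = [y for x, y in DATASET if x == label]
    if not candidates:
        raise ValueError(f"no samples with label {label!r}")
    return rng.choice(candidates)


rng = random.Random(0)  # seeded for repeatable runs
print(sample_unconditional(rng))       # any image, regardless of label
print(sample_conditional("cat", rng))  # always one of the cat images
```

In this toy setting the unconditional sampler returns a dog image more often than a cat image because dogs are more frequent in the data, whereas the conditional sampler is restricted to the requested label.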

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Papers

Showing 2501–2550 of 6689 papers

Title | Status | Hype
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation |  | 0
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models |  | 0
High Quality Diffusion Distillation on a Single GPU with Relative and Absolute Position Matching |  | 0
MMGen: Unified Multi-modal Image Generation and Understanding in One Go |  | 0
SITA: Structurally Imperceptible and Transferable Adversarial Attacks for Stylized Image Generation | Code | 0
Reverse Prompt: Cracking the Recipe Inside Text-to-Image Generation |  | 0
PCM : Picard Consistency Model for Fast Parallel Sampling of Diffusion Models |  | 0
Exploring Disentangled and Controllable Human Image Synthesis: From End-to-End to Stage-by-Stage |  | 0
LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration | Code | 0
VectorFit : Adaptive Singular & Bias Vector Fine-Tuning of Pre-trained Foundation Models |  | 0
Plug-and-Play Interpretable Responsible Text-to-Image Generation via Dual-Space Multi-facet Concept Control |  | 0
Training-free Diffusion Acceleration with Bottleneck Sampling |  | 0
PALATE: Peculiar Application of the Law of Total Expectation to Enhance the Evaluation of Deep Generative Models | Code | 0
Boosting Resolution Generalization of Diffusion Transformers with Randomized Positional Encodings |  | 0
Adoption of Watermarking Measures for AI-Generated Content and Implications under the EU AI Act |  | 0
An Image-like Diffusion Method for Human-Object Interaction Detection |  | 0
TCFG: Tangential Damping Classifier-free Guidance |  | 0
TransAnimate: Taming Layer Diffusion to Generate RGBA Video |  | 0
ComfyGPT: A Self-Optimizing Multi-Agent System for Comprehensive ComfyUI Workflow Generation |  | 0
Efficient Diffusion Training through Parallelization with Truncated Karhunen-Loève Expansion |  | 0
OMR-Diffusion:Optimizing Multi-Round Enhanced Training in Diffusion Models for Improved Intent Understanding |  | 0
DynASyn: Multi-Subject Personalization Enabling Dynamic Action Synthesis |  | 0
FundusGAN: A Hierarchical Feature-Aware Generative Framework for High-Fidelity Fundus Image Generation |  | 0
TDRI: Two-Phase Dialogue Refinement and Co-Adaptation for Interactive Image Generation |  | 0
D2C: Unlocking the Potential of Continuous Autoregressive Image Generation with Discrete Tokens |  | 0
Bayesian generative models can flag performance loss, bias, and out-of-distribution image content |  | 0
Leveraging Text-to-Image Generation for Handling Spurious Correlation |  | 0
Zero-Shot Styled Text Image Generation, but Make It Autoregressive |  | 0
End-to-end Sketch-Guided Path Planning through Imitation Learning for Autonomous Mobile Robots | Code | 0
EDiT: Efficient Diffusion Transformers with Linear Compressed Attention |  | 0
RL4Med-DDPO: Reinforcement Learning for Controlled Guidance Towards Diverse Medical Image Generation using Vision-Language Foundation Models |  | 0
Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction |  | 0
VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness | Code | 0
FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing |  | 0
World Knowledge from AI Image Generation for Robot Control |  | 0
PromptMobile: Efficient Promptus for Low Bandwidth Mobile Video Streaming |  | 0
ATTENTION2D: Communication Efficient Distributed Self-Attention Mechanism |  | 0
LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images |  | 0
A Survey on fMRI-based Brain Decoding for Reconstructing Multimodal Stimuli | Code | 0
Di[M]O: Distilling Masked Diffusion Models into One-step Generator |  | 0
CAM-Seg: A Continuous-valued Embedding Approach for Semantic Image Generation | Code | 0
Advancing Deep Learning through Probability Engineering: A Pragmatic Paradigm for Modern AI |  | 0
FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers | Code | 0
FetalFlex: Anatomy-Guided Diffusion Model for Flexible Control on Fetal Ultrasound Image Synthesis |  | 0
Guardians of Generation: Dynamic Inference-Time Copyright Shielding with Adaptive Guidance for AI Image Generation |  | 0
Conjuring Positive Pairs for Efficient Unification of Representation Learning and Image Synthesis |  | 0
TF-TI2I: Training-Free Text-and-Image-to-Image Generation via Multi-Modal Implicit-Context Learning in Text-to-Image Models |  | 0
DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers |  | 0
ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing |  | 0
DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection |  | 0
Page 51 of 134

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Improved DDPM | FID | 12.3 |  | Unverified
2 | ADM | FID | 11.84 |  | Unverified
3 | BigGAN-deep | FID | 8.1 |  | Unverified
4 | Polarity-BigGAN | FID | 6.82 |  | Unverified
5 | VQGAN+Transformer (k=mixed, p=1.0, a=0.005) | FID | 6.59 |  | Unverified
6 | MaskGIT | FID | 6.18 |  | Unverified
7 | VQGAN+Transformer (k=600, p=1.0, a=0.05) | FID | 5.2 |  | Unverified
8 | CDM | FID | 4.88 |  | Unverified
9 | ADM-G | FID | 4.59 |  | Unverified
10 | RIN | FID | 4.51 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PresGAN | FID | 52.2 |  | Unverified
2 | RESFLOW | FID | 48.29 |  | Unverified
3 | Residual Flow | FID | 46.37 |  | Unverified
4 | GLF+perceptual loss (ours) | FID | 44.6 |  | Unverified
5 | ProdPoly no activation functions | FID | 40.45 |  | Unverified
6 | ProdPoly no activation functions | FID | 36.77 |  | Unverified
7 | ACGAN | FID | 35.47 |  | Unverified
8 | DenseFlow-74-10 | FID | 34.9 |  | Unverified
9 | NVAE w/ flow | FID | 32.53 |  | Unverified
10 | QSNGAN | FID | 31.97 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GLIDE + CLS | FID | 30.87 |  | Unverified
2 | GLIDE + CLIP | FID | 30.46 |  | Unverified
3 | GLIDE + CLS-FREE | FID | 29.22 |  | Unverified
4 | GLIDE + CLIP + CLS + CLS-FREE | FID | 29.18 |  | Unverified
5 | PGMGAN | FID | 21.73 |  | Unverified
6 | CLR-GAN | FID | 20.27 |  | Unverified
7 | FM | FID | 14.45 |  | Unverified
8 | CT (Direct Generation, NFE=1) | FID | 13 |  | Unverified
9 | CT (Direct Generation, NFE=2) | FID | 11.1 |  | Unverified
10 | GLIDE +CLS | KID | 7.95 |  | Unverified