Open-Sora: Democratizing Efficient Video Production for All | Dec 29, 2024 | All, Image Generation | Code Available 135
Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling | Jan 29, 2025 | Image Generation | Code Available 115
InstantID: Zero-shot Identity-Preserving Generation in Seconds | Jan 15, 2024 | Diffusion Personalization, Diffusion Personalization Tuning Free | Code Available 115
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | Oct 14, 2024 | Image Generation, Image Reconstruction | Code Available 95
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on | Mar 4, 2024 | Denoising, Image Generation | Code Available 95
Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis | Dec 5, 2024 | Image Generation | Code Available 95
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction | Apr 3, 2024 | Image Generation, Image Reconstruction | Code Available 95
Emerging Properties in Unified Multimodal Pretraining | May 20, 2025 | Image Editing | Code Available 95
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers | Oct 14, 2024 | Decoder, GPU | Code Available 95
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer | Jan 30, 2025 | Image Generation, Model Compression | Code Available 95
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation | Apr 3, 2024 | Image Generation, Text to Image Generation | Code Available 75
Chameleon: Mixed-Modal Early-Fusion Foundation Models | May 16, 2024 | Image Captioning, Image Generation | Code Available 75
Learning Multi-dimensional Human Preference for Text-to-Image Generation | May 23, 2024 | Image Generation, Text to Image Generation | Code Available 75
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture | May 29, 2024 | Image Generation, Video Generation | Code Available 75
Adding Conditional Control to Text-to-Image Diffusion Models | Feb 10, 2023 | Image Generation, Layout-to-Image Generation | Code Available 75
MaskSketch: Unpaired Structure-guided Masked Image Generation | Feb 10, 2023 | Conditional Image Generation, Diversity | Code Available 75
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining | Aug 5, 2024 | Decoder, Depth Estimation | Code Available 75
DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation | Mar 13, 2024 | Image Generation, Prompt Engineering | Code Available 75
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration | Oct 3, 2024 | Image Generation, Quantization | Code Available 75
PuLID: Pure and Lightning ID Customization via Contrastive Alignment | Apr 24, 2024 | Image Generation, Text to Image Generation | Code Available 75
Improving Sample Quality of Diffusion Models Using Self-Attention Guidance | Oct 3, 2022 | Denoising, Diversity | Code Available 75
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers | Jan 21, 2024 | Image Generation | Code Available 75
In-Context LoRA for Diffusion Transformers | Oct 31, 2024 | Image Generation | Code Available 75
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity | Mar 20, 2025 | Image Generation | Code Available 75
SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization | Nov 17, 2024 | Image Generation, Quantization | Code Available 75
HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer | May 28, 2025 | Image Generation, Mixture-of-Experts | Code Available 75
OmniGen2: Exploration to Advanced Multimodal Generation | Jun 23, 2025 | Image Generation, multimodal generation | Code Available 75
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data | Oct 24, 2024 | Image Generation, Question Generation | Code Available 75
Goku: Flow Based Video Generative Foundation Models | Feb 7, 2025 | Image Generation, Text to Image Generation | Code Available 75
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT | Jun 5, 2024 | Image Generation, Point Cloud Generation | Code Available 75
OmniGen: Unified Image Generation | Sep 17, 2024 | Edge Detection, Image Generation | Code Available 75
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models | Jan 10, 2024 | GPU, Image Generation | Code Available 75
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding | May 14, 2024 | Image Generation, Language Modeling | Code Available 75
Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages | Aug 23, 2023 | Image Generation, Image to text | Code Available 65
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation | Dec 19, 2023 | Denoising, Image Generation | Code Available 65
Better speech synthesis through scaling | May 12, 2023 | Image Generation, Speech Synthesis | Code Available 65
Pseudo Numerical Methods for Diffusion Models on Manifolds | Feb 20, 2022 | Denoising, Image Generation | Code Available 65
Semi-Parametric Neural Image Synthesis | Apr 25, 2022 | Image Generation, Retrieval | Code Available 65
Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models | Jul 26, 2022 | Image Generation, Prompt Engineering | Code Available 65
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding | Dec 7, 2023 | Diffusion Personalization, Diffusion Personalization Tuning Free | Code Available 65
Adversarial Diffusion Distillation | Nov 28, 2023 | Image Generation | Code Available 65
Versatile Diffusion: Text, Images and Variations All in One Diffusion Model | Nov 15, 2022 | All, Disentanglement | Code Available 65
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation | Apr 2, 2025 | Conditional Image Generation, Image Generation | Code Available 55
Consistency Models | Mar 2, 2023 | Colorization, Image Generation | Code Available 55
Magic Clothing: Controllable Garment-Driven Image Synthesis | Apr 15, 2024 | Image Generation | Code Available 55
FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification | Oct 14, 2024 | Image Generation | Code Available 55
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts | Jun 13, 2024 | Conditional Image Generation, Image Generation | Code Available 55
CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion | Mar 8, 2024 | Computational Efficiency, Image Generation | Code Available 55
Fractal Generative Models | Feb 24, 2025 | Image Generation | Code Available 55
Learning Flow Fields in Attention for Controllable Person Image Generation | Dec 11, 2024 | Attribute, Image Generation | Code Available 55