Open-Sora: Democratizing Efficient Video Production for All | Dec 29, 2024 | All, Image Generation | Code Available | 13
InstantID: Zero-shot Identity-Preserving Generation in Seconds | Jan 15, 2024 | Diffusion Personalization, Diffusion Personalization Tuning Free | Code Available | 11
Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling | Jan 29, 2025 | Image Generation | Code Available | 11
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on | Mar 4, 2024 | Denoising, Image Generation | Code Available | 9
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers | Oct 14, 2024 | Decoder, GPU | Code Available | 9
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer | Jan 30, 2025 | Image Generation, Model Compression | Code Available | 9
Emerging Properties in Unified Multimodal Pretraining | May 20, 2025 | Image Editing | Code Available | 9
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | Oct 14, 2024 | Image Generation, Image Reconstruction | Code Available | 9
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction | Apr 3, 2024 | Image Generation, Image Reconstruction | Code Available | 9
Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis | Dec 5, 2024 | Image Generation | Code Available | 9
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture | May 29, 2024 | Image Generation, Video Generation | Code Available | 7
MaskSketch: Unpaired Structure-guided Masked Image Generation | Feb 10, 2023 | Conditional Image Generation, Diversity | Code Available | 7
Learning Multi-dimensional Human Preference for Text-to-Image Generation | May 23, 2024 | Image Generation, Text to Image Generation | Code Available | 7
Goku: Flow Based Video Generative Foundation Models | Feb 7, 2025 | Image Generation, Text to Image Generation | Code Available | 7
Chameleon: Mixed-Modal Early-Fusion Foundation Models | May 16, 2024 | Image Captioning, Image Generation | Code Available | 7
Adding Conditional Control to Text-to-Image Diffusion Models | Feb 10, 2023 | Image Generation, Layout-to-Image Generation | Code Available | 7
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining | Aug 5, 2024 | Decoder, Depth Estimation | Code Available | 7
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers | Jan 21, 2024 | Image Generation | Code Available | 7
SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization | Nov 17, 2024 | Image Generation, Quantization | Code Available | 7
DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation | Mar 13, 2024 | Image Generation, Prompt Engineering | Code Available | 7
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration | Oct 3, 2024 | Image Generation, Quantization | Code Available | 7
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation | Apr 3, 2024 | Image Generation, Text to Image Generation | Code Available | 7
In-Context LoRA for Diffusion Transformers | Oct 31, 2024 | Image Generation | Code Available | 7
Improving Sample Quality of Diffusion Models Using Self-Attention Guidance | Oct 3, 2022 | Denoising, Diversity | Code Available | 7
OmniGen: Unified Image Generation | Sep 17, 2024 | Edge Detection, Image Generation | Code Available | 7
HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer | May 28, 2025 | Image Generation, Mixture-of-Experts | Code Available | 7
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding | May 14, 2024 | Image Generation, Language Modeling | Code Available | 7
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data | Oct 24, 2024 | Image Generation, Question Generation | Code Available | 7
OmniGen2: Exploration to Advanced Multimodal Generation | Jun 23, 2025 | Image Generation, multimodal generation | Code Available | 7
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT | Jun 5, 2024 | Image Generation, Point Cloud Generation | Code Available | 7
PuLID: Pure and Lightning ID Customization via Contrastive Alignment | Apr 24, 2024 | Image Generation, Text to Image Generation | Code Available | 7
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models | Jan 10, 2024 | GPU, Image Generation | Code Available | 7
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity | Mar 20, 2025 | Image Generation | Code Available | 7
Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages | Aug 23, 2023 | Image Generation, Image to text | Code Available | 6
Versatile Diffusion: Text, Images and Variations All in One Diffusion Model | Nov 15, 2022 | All, Disentanglement | Code Available | 6
Semi-Parametric Neural Image Synthesis | Apr 25, 2022 | Image Generation, Retrieval | Code Available | 6
Pseudo Numerical Methods for Diffusion Models on Manifolds | Feb 20, 2022 | Denoising, Image Generation | Code Available | 6
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation | Dec 19, 2023 | Denoising, Image Generation | Code Available | 6
Adversarial Diffusion Distillation | Nov 28, 2023 | Image Generation | Code Available | 6
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding | Dec 7, 2023 | Diffusion Personalization, Diffusion Personalization Tuning Free | Code Available | 6
Better speech synthesis through scaling | May 12, 2023 | Image Generation, Speech Synthesis | Code Available | 6
Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models | Jul 26, 2022 | Image Generation, Prompt Engineering | Code Available | 6
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation | Apr 2, 2025 | Conditional Image Generation, Image Generation | Code Available | 5
Consistency Models | Mar 2, 2023 | Colorization, Image Generation | Code Available | 5
Magic Clothing: Controllable Garment-Driven Image Synthesis | Apr 15, 2024 | Image Generation | Code Available | 5
CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion | Mar 8, 2024 | Computational Efficiency, Image Generation | Code Available | 5
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts | Jun 13, 2024 | Conditional Image Generation, Image Generation | Code Available | 5
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models | Jul 21, 2024 | All, Fashion Synthesis | Code Available | 5
FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification | Oct 14, 2024 | Image Generation | Code Available | 5
Fractal Generative Models | Feb 24, 2025 | Image Generation | Code Available | 5