Autoregressive Image Generation without Vector Quantization | Jun 17, 2024 | Image Generation Quantization | Code Available 55
FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification | Oct 14, 2024 | Image Generation | Code Available 55
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment | Mar 8, 2024 | Denoising Image Generation | Code Available 55
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs | Jan 22, 2024 | Diffusion Personalization Tuning Free Image Generation | Code Available 55
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation | Mar 12, 2024 | Image Generation Language Modelling | Code Available 55
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset | May 14, 2025 | Image Generation | Code Available 55
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation | Jun 10, 2024 | Conditional Image Generation Image Generation | Code Available 55
MV-Adapter: Multi-view Consistent Image Generation Made Easy | Dec 4, 2024 | 3D Generation Image Generation | Code Available 55
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models | Nov 20, 2024 | Benchmarking Image Generation | Code Available 55
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | Aug 25, 2022 | Diffusion Personalization Image Generation | Code Available 55
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers | Jan 16, 2024 | Image Generation | Code Available 55
Diffusion for World Modeling: Visual Details Matter in Atari | May 20, 2024 | Image Generation reinforcement-learning | Code Available 55
DreamFusion: Text-to-3D using 2D Diffusion | Sep 29, 2022 | Denoising Image Generation | Code Available 55
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks | Jun 12, 2024 | Image Generation Language Modeling | Code Available 55
Learning Flow Fields in Attention for Controllable Person Image Generation | Dec 11, 2024 | Attribute Image Generation | Code Available 55
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation | Apr 2, 2025 | Conditional Image Generation Image Generation | Code Available 55
Scalable Diffusion Models with Transformers | Dec 19, 2022 | Image Generation | Code Available 55
Magic Clothing: Controllable Garment-Driven Image Synthesis | Apr 15, 2024 | Image Generation | Code Available 55
IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models | Aug 13, 2023 | Diffusion Personalization Tuning Free Image Generation | Code Available 55
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation | Oct 9, 2024 | Attribute Image Generation | Code Available 55
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation | Aug 22, 2024 | 10-shot image generation | Code Available 55
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think | Oct 9, 2024 | Denoising Image Generation | Code Available 55
InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework | Apr 16, 2025 | Image Generation | Code Available 55
Improved Distribution Matching Distillation for Fast Image Synthesis | May 23, 2024 | Image Generation | Code Available 55
Randomized Autoregressive Visual Generation | Nov 1, 2024 | Image Generation Language Modeling | Code Available 55
IMAGDressing-v1: Customizable Virtual Dressing | Jul 17, 2024 | Denoising Image Generation | Code Available 55
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models | Jan 2, 2025 | Image Generation | Code Available 55
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion | Aug 2, 2022 | Image Generation Personalized Image Generation | Code Available 55
Consistency Models | Mar 2, 2023 | Colorization Image Generation | Code Available 55
Image Vectorization: a Review | Jun 10, 2023 | Image Generation Vector Graphics | Code Available 55
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation | Mar 7, 2024 | 4k Image Captioning | Code Available 55
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts | Jun 13, 2024 | Conditional Image Generation Image Generation | Code Available 55
GLIGEN: Open-Set Grounded Text-to-Image Generation | Jan 17, 2023 | Conditional Text-to-Image Synthesis Image Generation | Code Available 45
One Diffusion to Generate Them All | Nov 25, 2024 | All Camera Pose Estimation | Code Available 45
Null-text Inversion for Editing Real Images using Guided Diffusion Models | Nov 17, 2022 | Image Generation Text-based Image Editing | Code Available 45
Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion | Jan 27, 2023 | GPU Image Generation | Code Available 45
OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models | Mar 16, 2024 | Denoising Image Generation | Code Available 45
Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation | Jun 4, 2024 | Face Swapping GPU | Code Available 45
MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis | Feb 8, 2024 | Attribute Conditional Text-to-Image Synthesis | Code Available 45
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step | Jan 23, 2025 | Image Generation Text-to-Image Generation | Code Available 45
MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis | Jul 2, 2024 | Attribute Image Generation | Code Available 45
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction | May 5, 2025 | Image Generation multimodal interaction | Code Available 45
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think | Sep 17, 2024 | Conditional Image Generation Depth Estimation | Code Available 45
Guiding a Diffusion Model with a Bad Version of Itself | Jun 4, 2024 | Image Generation | Code Available 45
Elucidating the Design Space of Diffusion-Based Generative Models | Jun 1, 2022 | Image Generation | Code Available 45
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications | Jan 11, 2024 | image-classification Image Classification | Code Available 45
Ming-Omni: A Unified Multimodal Model for Perception and Generation | Jun 11, 2025 | Image Generation text-to-speech | Code Available 45
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation | Sep 6, 2024 | Image Generation Image Reconstruction | Code Available 45
Long-CLIP: Unlocking the Long-Text Capability of CLIP | Mar 22, 2024 | Image Generation Image Retrieval | Code Available 45
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction | Sep 26, 2024 | 3D Reconstruction Denoising | Code Available 45