Elucidating the Design Space of Diffusion-Based Generative Models Jun 1, 2022 Image Generation
High-Resolution Image Synthesis with Latent Diffusion Models Dec 20, 2021 Denoising GPU
ControlVAE: Tuning, Analytical Properties, and Performance Analysis Oct 31, 2020 Disentanglement Image Generation
DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge Jul 6, 2025 Image Generation Multimodal Reasoning
Ovis-U1 Technical Report Jun 29, 2025 Image Generation Text to Image Generation
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation Jun 22, 2025 GPU Image Generation
Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens Jun 20, 2025 Image Generation Multimodal Reasoning
Highly Compressed Tokenizer Can Generate Without Training Jun 9, 2025 Image Generation Quantization
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation Jun 2, 2025 4k Descriptive
Nexus-Gen: A Unified Model for Image Understanding, Generation, and Editing Apr 30, 2025 Image Generation
PixelHacker: Image Inpainting with Structural and Semantic Consistency Apr 29, 2025 Denoising Image Generation
REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers Apr 15, 2025 Image Generation
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation Apr 11, 2025 Decoder Image Generation
PixelFlow: Pixel-Space Generative Models with Flow Apr 10, 2025 Conditional Image Generation Image Generation
DDT: Decoupled Diffusion Transformer Apr 8, 2025 Denoising Image Generation
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation Apr 3, 2025 Image Generation World Knowledge
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning Apr 3, 2025 Image Generation Instruction Following
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction Apr 1, 2025 Image Generation
AI2Agent: An End-to-End Framework for Deploying AI Projects as Autonomous Agents Mar 31, 2025 Image Generation Text to Image Generation
Optimal Stepsize for Diffusion Sampling Mar 27, 2025 Denoising Image Generation
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models Mar 24, 2025 4k Image Generation
Halton Scheduler For Masked Generative Image Transformer Mar 21, 2025 Image Generation Text to Image Generation
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing Mar 13, 2025 Image Generation Language Modeling
Robust Latent Matters: Boosting Image Generation with Sampling Error Mar 11, 2025 Benchmarking Image Generation
Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation Mar 3, 2025 3D Generation 3D Reconstruction
Attention Distillation: A Unified Approach to Visual Characteristics Transfer Feb 27, 2025 Denoising Image Generation
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation Feb 27, 2025 Image Generation token-classification
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation Feb 25, 2025 Image Generation
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks Feb 24, 2025 Conditional Image Generation Image Generation
Personalized Image Generation with Deep Generative Models: A Decade Survey Feb 18, 2025 Image Generation Personalized Image Generation
LLMs can see and hear without any training Jan 30, 2025 Audio captioning Image Generation
One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt Jan 23, 2025 Image Generation Story Generation
VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model Jan 21, 2025 Image Generation Instruction Following
3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering Jan 9, 2025 Image Generation Text to Image Generation
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up Dec 20, 2024 8k GPU
Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance Dec 17, 2024 Image Generation Object
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer Dec 14, 2024 Denoising Image Generation
TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation Dec 4, 2024 Image Generation Image Reconstruction
MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost Dec 2, 2024 Image Generation
Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework Oct 28, 2024 Image Generation Image Manipulation
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model Oct 17, 2024 Computational Efficiency Image Cropping
3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation Oct 16, 2024 Attribute Image Generation
CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation Oct 12, 2024 Conditional Image Generation GPU
SceneCraft: Layout-Guided 3D Scene Generation Oct 11, 2024 3D Generation Image Generation
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis Oct 10, 2024 Feature Compression Image Generation
AP-LDM: Attentive and Progressive Latent Diffusion Model for Training-Free High-Resolution Image Generation Oct 8, 2024 Denoising Image Generation
ControlAR: Controllable Image Generation with Autoregressive Models Oct 3, 2024 Image Generation
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding Oct 2, 2024 Image Generation Text to Image Generation
ImageFolder: Autoregressive Image Generation with Folded Tokens Oct 2, 2024 Image Generation Image Reconstruction
Simple and Fast Distillation of Diffusion Models Sep 29, 2024 GPU Image Generation