DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks | Feb 24, 2025 | Conditional Image Generation, Image Generation
DF40: Toward Next-Generation Deepfake Detection | Jun 19, 2024 | DeepFake Detection, Face Reenactment
AP-LDM: Attentive and Progressive Latent Diffusion Model for Training-Free High-Resolution Image Generation | Oct 8, 2024 | Denoising, Image Generation
MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer | Jan 19, 2023 | Image Generation, Image Segmentation
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis | Oct 10, 2024 | Feature Compression, Image Generation
On the Trajectory Regularity of ODE-based Diffusion Sampling | May 18, 2024 | Denoising, Image Generation
Designing a Better Asymmetric VQGAN for StableDiffusion | Jun 7, 2023 | Decoder, Image Generation
MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model | Nov 1, 2022 | Anomaly Detection, Brain Tumor Segmentation
Magic-Me: Identity-Specific Video Customized Diffusion | Feb 14, 2024 | Image Generation, Text to Image Generation
Ovis-U1 Technical Report | Jun 29, 2025 | Image Generation, Text to Image Generation
Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens | Jun 20, 2025 | Image Generation, Multimodal Reasoning
MaskGIT: Masked Generative Image Transformer | Feb 8, 2022 | Decoder, Image Generation
DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing | Mar 21, 2024 | Image Generation, spatial-aware image editing
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation | Feb 27, 2025 | Image Generation, token-classification
LLMs can see and hear without any training | Jan 30, 2025 | Audio captioning, Image Generation
Deep Generative Models on 3D Representations: A Survey | Oct 27, 2022 | 3D-Aware Image Synthesis, 3D Shape Generation
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction | Apr 1, 2025 | Image Generation
Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation | Aug 19, 2024 | Image Generation, Video Generation
An Image is Worth 32 Tokens for Reconstruction and Generation | Jun 11, 2024 | Image Generation, Image Reconstruction
3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering | Jan 9, 2025 | Image Generation, Text to Image Generation
Behavior Generation with Latent Actions | Mar 5, 2024 | Autonomous Driving, Decision Making
3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation | Oct 16, 2024 | Attribute, Image Generation
Deciphering Oracle Bone Language with Diffusion Models | Jun 2, 2024 | Decipherment, Image Generation
Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation | Mar 3, 2025 | 3D Generation, 3D Reconstruction
Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework | Oct 28, 2024 | Image Generation, Image Manipulation
CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation | Oct 12, 2024 | Conditional Image Generation, GPU
Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models | Jan 1, 2024 | Image Generation, Text to Image Generation
DDT: Decoupled Diffusion Transformer | Apr 8, 2025 | Denoising, Image Generation
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models | Feb 8, 2022 | Diagnostic, Image Captioning
Emu3: Next-Token Prediction is All You Need | Sep 27, 2024 | All
ModelScope Text-to-Video Technical Report | Aug 12, 2023 | Denoising, Image Generation
Anatomically-Controllable Medical Image Generation with Segmentation-Guided Diffusion Models | Feb 7, 2024 | counterfactual, Image Generation
Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance | Dec 17, 2024 | Image Generation, Object
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation | Sep 12, 2023 | GPU, Image Generation
ControlAR: Controllable Image Generation with Autoregressive Models | Oct 3, 2024 | Image Generation
Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer | May 7, 2024 | Image Generation, Super-Resolution
Consistency Models Made Easy | Jun 20, 2024 | Computational Efficiency, GPU
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation | Apr 12, 2023 | Image Generation, Preference Mapping
All are Worth Words: A ViT Backbone for Diffusion Models | Sep 25, 2022 | All, Conditional Image Generation
Consistency Flow Matching: Defining Straight Flows with Velocity Consistency | Jul 2, 2024 | Image Generation
Improved Denoising Diffusion Probabilistic Models | Feb 18, 2021 | Denoising, Image Generation
Image and Video Tokenization with Binary Spherical Quantization | Jun 11, 2024 | Decoder, Image Generation
AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation | Jun 3, 2024 | Image Generation
ImageFolder: Autoregressive Image Generation with Folded Tokens | Oct 2, 2024 | Image Generation, Image Reconstruction
Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models | Nov 20, 2023 | Image Generation
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model | Oct 17, 2024 | Computational Efficiency, Image Cropping
ImageInWords: Unlocking Hyper-Detailed Image Descriptions | May 5, 2024 | Image Generation, Specificity
Hierarchical Text-Conditional Image Generation with CLIP Latents | Apr 13, 2022 | Conditional Image Generation, Decoder
Flow Matching for Generative Modeling | Oct 6, 2022 | Density Estimation, Image Generation
Highly Compressed Tokenizer Can Generate Without Training | Jun 9, 2025 | Image Generation, Quantization