Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes | Apr 14, 2025 | Image Generation, Large Language Model | Code Available | 1
Flux Already Knows -- Activating Subject-Driven Image Generation without Training | Apr 12, 2025 | Image Generation, Virtual Try-on | Code Available | 2
seg2med: a bridge from artificial anatomy to multimodal medical images | Apr 12, 2025 | Anatomy, Data Augmentation | Unverified | 0
Towards Explainable Partial-AIGC Image Quality Assessment | Apr 12, 2025 | Image Generation, Image Manipulation | Unverified | 0
Discriminator-Free Direct Preference Optimization for Video Diffusion | Apr 11, 2025 | Image Generation | Unverified | 0
On the Design of Diffusion-based Neural Speech Codecs | Apr 11, 2025 | Audio Generation, Image Generation | Unverified | 0
LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs | Apr 11, 2025 | Benchmarking, Image Generation | Code Available | 1
CoProSketch: Controllable and Progressive Sketch Generation with Diffusion Model | Apr 11, 2025 | Image Generation | Unverified | 0
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation | Apr 11, 2025 | Decoder, Image Generation | Code Available | 3
Generating Fine Details of Entity Interactions | Apr 11, 2025 | Image Generation | Unverified | 0
Muon-Accelerated Attention Distillation for Real-Time Edge Synthesis via Optimized Latent Diffusion | Apr 11, 2025 | Image Generation, Quantization | Unverified | 0
Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging | Apr 11, 2025 | Attribute, Computational Efficiency | Code Available | 1
MixDiT: Accelerating Image Diffusion Transformer Inference with Mixed-Precision MX Quantization | Apr 11, 2025 | Image Generation, Quantization | Unverified | 0
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment | Apr 10, 2025 | AI Agent, Attribute | Unverified | 0
POEM: Precise Object-level Editing via MLLM control | Apr 10, 2025 | Image Generation, Object | Unverified | 0
ID-Booth: Identity-consistent Face Generation with Diffusion Models | Apr 10, 2025 | Denoising, Diversity | Code Available | 1
DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows | Apr 10, 2025 | Diversity, Image Generation | Unverified | 0
Model Discrepancy Learning: Synthetic Faces Detection Based on Multi-Reconstruction | Apr 10, 2025 | Face Detection, Image Generation | Unverified | 0
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning | Apr 10, 2025 | Image Generation, In-Context Learning | Unverified | 0
PixelFlow: Pixel-Space Generative Models with Flow | Apr 10, 2025 | Conditional Image Generation, Image Generation | Code Available | 3
FlexIP: Dynamic Control of Preservation and Personality for Customized Image Generation | Apr 10, 2025 | Image Generation | Unverified | 0
PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering | Apr 9, 2025 | Image Generation | Unverified | 0
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation | Apr 9, 2025 | Image Generation, Text to Image Generation | Code Available | 1
Have we unified image generation and understanding yet? An empirical study of GPT-4o's image generation ability | Apr 9, 2025 | Image Generation, multimodal generation | Unverified | 0
A Unified Agentic Framework for Evaluating Conditional Image Generation | Apr 9, 2025 | Conditional Image Generation, Image Generation | Code Available | 1
OmniCaptioner: One Captioner to Rule Them All | Apr 9, 2025 | All, Image Captioning | Code Available | 2
Compass Control: Multi Object Orientation Control for Text-to-Image Generation | Apr 9, 2025 | Image Generation, Object | Unverified | 0
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance | Apr 8, 2025 | Image Generation | Code Available | 2
Parasite: A Steganography-based Backdoor Attack Framework for Diffusion Models | Apr 8, 2025 | Backdoor Attack, Image Generation | Unverified | 0
Transfer between Modalities with MetaQueries | Apr 8, 2025 | Decoder | Unverified | 0
An Empirical Study of GPT-4o Image Generation Capabilities | Apr 8, 2025 | Benchmarking, Image Generation | Code Available | 1
D-Feat Occlusions: Diffusion Features for Robustness to Partial Visual Occlusions in Object Recognition | Apr 8, 2025 | Image Generation, Object | Unverified | 0
A Training-Free Style-aligned Image Generation with Scale-wise Autoregressive Model | Apr 8, 2025 | Image Generation | Unverified | 0
DDT: Decoupled Diffusion Transformer | Apr 8, 2025 | Denoising, Image Generation | Code Available | 3
CDM-QTA: Quantized Training Acceleration for Efficient LoRA Fine-Tuning of Diffusion Model | Apr 8, 2025 | Image Generation | Unverified | 0
Storybooth: Training-free Multi-Subject Consistency for Improved Visual Storytelling | Apr 8, 2025 | Image Generation, Text to Image Generation | Unverified | 0
Mind the Trojan Horse: Image Prompt Adapter Enabling Scalable and Deceptive Jailbreaking | Apr 8, 2025 | Image Generation | Code Available | 1
Gaussian Mixture Flow Matching Models | Apr 7, 2025 | Denoising, Image Generation | Code Available | 2
Generative Adversarial Networks with Limited Data: A Survey and Benchmarking | Apr 7, 2025 | Benchmarking, Image Generation | Unverified | 0
Multimodal Cinematic Video Synthesis Using Text-to-Image and Audio Generation Models | Apr 6, 2025 | Audio Generation, GPU | Unverified | 0
UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding | Apr 6, 2025 | Image Generation | Code Available | 2
Thermoxels: a voxel-based method to generate simulation-ready 3D thermal models | Apr 6, 2025 | 3D Reconstruction, Image Generation | Unverified | 0
Digital Gene: Learning about the Physical World through Analytic Concepts | Apr 5, 2025 | Image Generation, object-detection | Unverified | 0
A Hybrid Wavelet-Fourier Method for Next-Generation Conditional Diffusion Models | Apr 4, 2025 | Conditional Image Generation, Image Generation | Unverified | 0
Dynamic Importance in Diffusion U-Net for Enhanced Image Synthesis | Apr 4, 2025 | Denoising, Image Generation | Code Available | 0
QIRL: Boosting Visual Question Answering via Optimized Question-Image Relation Learning | Apr 4, 2025 | Data Augmentation, Image Generation | Unverified | 0
MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models | Apr 4, 2025 | Benchmarking, Image Generation | Unverified | 0
FLAIRBrainSeg: Fine-grained brain segmentation using FLAIR MRI only | Apr 4, 2025 | Brain Segmentation, Image Generation | Unverified | 0
Detection Limits and Statistical Separability of Tree Ring Watermarks in Rectified Flow-based Text-to-Image Generation Models | Apr 4, 2025 | Image Generation, Text to Image Generation | Code Available | 0
Bias in Large Language Models Across Clinical Applications: A Systematic Review | Apr 3, 2025 | Image Generation | Unverified | 0