| Subobject-level Image Tokenization | Feb 22, 2024 | Attribute, Language Modeling | Code Available | 2 |
| Oceanship: A Large-Scale Dataset for Underwater Audio Target Recognition | Jan 4, 2024 | Attribute, Audio Classification | Code Available | 2 |
| Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding | Jan 1, 2024 | Attribute | Code Available | 2 |
| When StyleGAN Meets Stable Diffusion: a W+ Adapter for Personalized Image Generation | Jan 1, 2024 | Attribute, Disentanglement | Code Available | 2 |
| SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing | Dec 20, 2023 | Attribute, Cross-Modal Retrieval | Code Available | 2 |
| Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers | Dec 13, 2023 | 3D Question Answering (3D-QA), Attribute | Code Available | 2 |
| RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models | Dec 7, 2023 | Attribute, Video Editing | Code Available | 2 |
| GenEval: An Object-Focused Framework for Evaluating Text-to-Image Alignment | Oct 17, 2023 | Attribute, Object | Code Available | 2 |
| HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending | Oct 16, 2023 | Attribute | Code Available | 2 |
| BlendFace: Re-designing Identity Encoders for Face-Swapping | Jul 20, 2023 | Attribute, Disentanglement | Code Available | 2 |