| A Pedestrian is Worth One Prompt: Towards Language Guidance Person Re-Identification | Jan 1, 2024 | Attribute, Person Re-Identification | Unverified | 0 |
| Boosting Spike Camera Image Reconstruction from a Perspective of Dealing with Spike Fluctuations | Jan 1, 2024 | Attribute, Image Reconstruction | Code Available | 1 |
| Jack of All Tasks Master of Many: Designing General-Purpose Coarse-to-Fine Vision-Language Model | Jan 1, 2024 | All, Attribute | Unverified | 0 |
| Beyond Seen Primitive Concepts and Attribute-Object Compositional Learning | Jan 1, 2024 | Attribute, Compositional Zero-Shot Learning | Unverified | 0 |
| Attribute-Guided Pedestrian Retrieval: Bridging Person Re-ID with Internal Attribute Variability | Jan 1, 2024 | Attribute, Image Retrieval | Unverified | 0 |
| MaskPLAN: Masked Generative Layout Planning from Partial Input | Jan 1, 2024 | Attribute, Design Synthesis | Unverified | 0 |
| Multi-Attribute Interactions Matter for 3D Visual Grounding | Jan 1, 2024 | 3D visual grounding, Attribute | Code Available | 0 |
| 3D-Aware Face Editing via Warping-Guided Latent Direction Learning | Jan 1, 2024 | Attribute, Facial Editing | Unverified | 0 |
| Investigating Compositional Challenges in Vision-Language Models for Visual Grounding | Jan 1, 2024 | Attribute, Relation | Code Available | 0 |
| Synthesize Diagnose and Optimize: Towards Fine-Grained Vision-Language Understanding | Jan 1, 2024 | Attribute | Code Available | 2 |