| Natural scene reconstruction from fMRI signals using generative latent diffusion | Mar 9, 2023 | Brain Computer Interface, Brain Decoding | Code Available | 1 | 5 |
| A Sparse and Locally Coherent Morphable Face Model for Dense Semantic Correspondence Across Heterogeneous 3D Faces | Jun 6, 2020 | Descriptive, Face Model | Code Available | 1 | 5 |
| FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models | Nov 2, 2023 | Descriptive, Instruction Following | Code Available | 1 | 5 |
| FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes | Oct 15, 2021 | Descriptive, Image Classification | Code Available | 1 | 5 |
| FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant | Aug 19, 2024 | Descriptive, Face Swapping | Code Available | 1 | 5 |
| Enhancing CLIP with GPT-4: Harnessing Visual Descriptions as Prompts | Jul 21, 2023 | Descriptive, Prompt Engineering | Code Available | 1 | 5 |
| Comprehensive Information Integration Modeling Framework for Video Titling | Jun 24, 2020 | Descriptive, Video Captioning | Code Available | 1 | 5 |
| CENet: Toward Concise and Efficient LiDAR Semantic Segmentation for Autonomous Driving | Jul 26, 2022 | 3D Semantic Segmentation, Autonomous Driving | Code Available | 1 | 5 |
| Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search | Feb 2, 2021 | Descriptive, Image Generation | Code Available | 1 | 5 |
| Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | May 10, 2025 | Descriptive, Emotion Recognition | Code Available | 1 | 5 |