| Causality-Inspired Fair Representation Learning for Multimodal Recommendation | Oct 26, 2023 | Attribute, Causal Inference | Code Available | 1 |
| Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval | Aug 1, 2024 | Attribute, Optical Character Recognition | Code Available | 1 |
| ComiCap: A VLMs pipeline for dense captioning of Comic Panels | Sep 24, 2024 | Attribute, Dense Captioning | Code Available | 1 |
| From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency | Oct 7, 2024 | Attribute | Code Available | 1 |
| FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions | May 28, 2023 | Attribute, Image Captioning | Code Available | 1 |
| GaitFormer: Learning Gait Representations with Noisy Multi-Task Learning | Oct 30, 2023 | Attribute, Multi-Task Learning | Code Available | 1 |
| Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle | Dec 6, 2023 | 3DGS, 3D Reconstruction | Code Available | 1 |
| MLLM-CompBench: A Comparative Reasoning Benchmark for Multimodal LLMs | Jul 23, 2024 | Attribute | Code Available | 1 |
| Gender Bias in Masked Language Models for Multiple Languages | May 1, 2022 | Attribute, Sentence | Code Available | 1 |
| Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models | May 23, 2023 | Attribute, Image Generation | Code Available | 1 |