| VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | DecoderObject | CodeCode Available | 3 |
| LLaRA: Supercharging Robot Learning Data for Vision-Language Policy | Jun 28, 2024 | Vision-Language-ActionWorld Knowledge | CodeCode Available | 3 |
| Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation | Apr 15, 2024 | Contrastive LearningDescriptive | CodeCode Available | 3 |
| GS2Mesh: Surface Reconstruction from Gaussian Splatting via Novel Stereo Views | Apr 2, 2024 | 3DGSNovel View Synthesis | CodeCode Available | 3 |
| Are We on the Right Way for Evaluating Large Vision-Language Models? | Mar 29, 2024 | World Knowledge | CodeCode Available | 3 |
| Unified Source-Free Domain Adaptation | Mar 12, 2024 | Domain AdaptationLanguage Modelling | CodeCode Available | 3 |
| Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models | Sep 3, 2023 | HallucinationWorld Knowledge | CodeCode Available | 3 |
| How Can Recommender Systems Benefit from Large Language Models: A Survey | Jun 9, 2023 | EthicsFeature Engineering | CodeCode Available | 3 |
| ConTextTab: A Semantics-Aware Tabular In-Context Learner | Jun 12, 2025 | In-Context LearningWorld Knowledge | CodeCode Available | 2 |
| Free-form language-based robotic reasoning and grasping | Mar 17, 2025 | FormRobotic Grasping | CodeCode Available | 2 |