| MeaCap: Memory-Augmented Zero-shot Image Captioning | Mar 6, 2024 | Caption GenerationImage Captioning | CodeCode Available | 2 |
| WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition | Feb 22, 2024 | Image-level Supervised Instance Segmentationobject-detection | CodeCode Available | 2 |
| GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering | Feb 4, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| Can AI Assistants Know What They Don't Know? | Jan 24, 2024 | MathOpen-Domain Question Answering | CodeCode Available | 2 |
| V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs | Dec 21, 2023 | Visual Question AnsweringWorld Knowledge | CodeCode Available | 2 |
| LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin | Dec 15, 2023 | Language ModellingMixture-of-Experts | CodeCode Available | 2 |
| CapsFusion: Rethinking Image-Text Data at Scale | Oct 31, 2023 | World Knowledge | CodeCode Available | 2 |
| FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation | Oct 5, 2023 | HallucinationWorld Knowledge | CodeCode Available | 2 |
| Grasp-Anything: Large-scale Grasp Dataset from Foundation Models | Sep 18, 2023 | DiversityRobotic Grasping | CodeCode Available | 2 |
| Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations | Aug 23, 2023 | BenchmarkingDecoder | CodeCode Available | 2 |