| VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | Decoder, Object | Code Available | 3 |
| Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models | May 29, 2024 | 3D Instance Segmentation, 3D Semantic Segmentation | Unverified | 0 |
| Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model | May 27, 2024 | Decoder, Language Modeling | Code Available | 2 |
| LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning | Apr 12, 2024 | Image Segmentation, Language Modeling | Code Available | 2 |
| CoReS: Orchestrating the Dance of Reasoning and Segmentation | Apr 8, 2024 | Reasoning Segmentation, Segmentation | Code Available | 1 |
| Empowering Segmentation Ability to Multi-modal Large Language Models | Mar 21, 2024 | Dialogue Generation, Reasoning Segmentation | Code Available | 0 |
| LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model | Dec 28, 2023 | Instance Segmentation, Language Modeling | Code Available | 4 |
| FoodLMM: A Versatile Food Assistant using Large Multi-modal Model | Dec 22, 2023 | Food Recognition, Multi-Task Learning | Unverified | 0 |
| PixelLM: Pixel Reasoning with Large Multimodal Model | Dec 4, 2023 | Decoder, model | Code Available | 2 |
| Beyond Segmentation: Road Network Generation with Multi-Modal LLMs | Oct 15, 2023 | Autonomous Navigation, Language Modeling | Unverified | 0 |