| Pixel-Level Reasoning Segmentation via Multi-turn Conversations | Feb 13, 2025 | Reasoning Segmentation, Segmentation | Code Available | 0 |
| POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation | Jan 1, 2025 | Hallucination, Reasoning Segmentation | Unverified | 0 |
| PRIMA: Multi-Image Vision-Language Models for Reasoning Segmentation | Dec 19, 2024 | Reasoning Segmentation | Unverified | 0 |
| Multimodal 3D Reasoning Segmentation with Complex Scenes | Nov 21, 2024 | Reasoning Segmentation, Scene Understanding | Unverified | 0 |
| Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level | Nov 15, 2024 | Benchmarking, counterfactual | Unverified | 0 |
| SegLLM: Multi-round Reasoning Segmentation | Oct 24, 2024 | Reasoning Segmentation, Referring Expression | Unverified | 0 |
| One Framework to Rule Them All: Unifying Multimodal Tasks with LLM Neural-Tuning | Aug 6, 2024 | All, Image Captioning | Unverified | 0 |
| Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models | May 29, 2024 | 3D Instance Segmentation, 3D Semantic Segmentation | Unverified | 0 |
| Empowering Segmentation Ability to Multi-modal Large Language Models | Mar 21, 2024 | Dialogue Generation, Reasoning Segmentation | Code Available | 0 |
| FoodLMM: A Versatile Food Assistant using Large Multi-modal Model | Dec 22, 2023 | Food Recognition, Multi-Task Learning | Unverified | 0 |