| VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning | May 17, 2025 | 2D Object DetectionObject Counting | CodeCode Available | 4 |
| Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement | Mar 9, 2025 | Domain GeneralizationObject Detection | CodeCode Available | 4 |
| LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model | Dec 28, 2023 | Instance SegmentationLanguage Modeling | CodeCode Available | 4 |
| LISA: Reasoning Segmentation via Large Language Model | Aug 1, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 4 |
| UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface | Mar 3, 2025 | Instance SegmentationReasoning Segmentation | CodeCode Available | 3 |
| VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | DecoderObject | CodeCode Available | 3 |
| Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning | Jun 27, 2025 | Foreground Segmentationobject-detection | CodeCode Available | 2 |
| The Devil is in Temporal Token: High Quality Video Reasoning Segmentation | Jan 15, 2025 | Reasoning SegmentationReferring Expression Segmentation | CodeCode Available | 2 |
| HyperSeg: Hybrid Segmentation Assistant with Fine-grained Visual Perceiver | Jan 1, 2025 | Reasoning SegmentationSegmentation | CodeCode Available | 2 |
| InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models | Dec 18, 2024 | Reasoning SegmentationSegmentation | CodeCode Available | 2 |