| VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning | May 17, 2025 | 2D Object Detection, Object Counting | Code Available | 4 |
| LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery | May 5, 2025 | Reasoning Segmentation, Segmentation | Unverified | 0 |
| SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding | Apr 17, 2025 | Image Generation, Large Language Model | Code Available | 1 |
| MediSee: Reasoning-based Pixel-level Perception in Medical Images | Apr 15, 2025 | Logical Reasoning, Reasoning Segmentation | Unverified | 0 |
| LVLM_CSP: Accelerating Large Vision Language Models via Clustering, Scattering, and Pruning for Reasoning Segmentation | Apr 15, 2025 | Image Captioning, Question Answering | Unverified | 0 |
| Online Reasoning Video Segmentation with Just-in-Time Digital Twins | Mar 27, 2025 | Reasoning Segmentation, Video Segmentation | Unverified | 0 |
| Operating Room Workflow Analysis via Reasoning Segmentation over Digital Twins | Mar 26, 2025 | Large Language Model, Reasoning Segmentation | Unverified | 0 |
| MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation | Mar 23, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation | Mar 18, 2025 | Object, Reasoning Segmentation | Code Available | 1 |
| VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation | Mar 18, 2025 | Reasoning Segmentation, Video Editing | Unverified | 0 |