| TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models | May 29, 2025 | Referring Expression, Referring Expression Comprehension | Code Available | 2 |
| The Missing Point in Vision Transformers for Universal Image Segmentation | May 26, 2025 | Image Segmentation, Instance Segmentation | Code Available | 2 |
| Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration | May 26, 2025 | Domain Generalization, Hallucination | Code Available | 2 |
| MetaUAS: Universal Anomaly Segmentation with One-Prompt Meta-Learning | May 14, 2025 | Anomaly Detection, Anomaly Segmentation | Code Available | 2 |
| Recent Advances in Medical Imaging Segmentation: A Survey | May 14, 2025 | Domain Adaptation, Few-Shot Learning | Code Available | 2 |
| Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation | May 9, 2025 | Image Generation, Image Segmentation | Code Available | 2 |
| DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception | May 7, 2025 | object-detection, Object Detection | Code Available | 2 |
| Rethinking Boundary Detection in Deep Learning-Based Medical Image Segmentation | May 6, 2025 | Boundary Detection, Decoder | Code Available | 2 |
| Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook | May 1, 2025 | Benchmarking, Change Detection | Code Available | 2 |
| Digital Twin Generation from Visual Data: A Survey | Apr 17, 2025 | Semantic Segmentation, Survey | Code Available | 2 |
| The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer | Apr 14, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| P2Object: Single Point Supervised Object Detection and Instance Segmentation | Apr 10, 2025 | Instance Segmentation, Multiple Instance Learning | Code Available | 2 |
| GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | Apr 10, 2025 | Contrastive Learning, Language Modeling | Code Available | 2 |
| Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation | Apr 8, 2025 | Domain Adaptation, Domain Generalization | Code Available | 2 |
| Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting | Apr 7, 2025 | Boundary Detection, Object | Code Available | 2 |
| SlicerNNInteractive: A 3D Slicer extension for nnInteractive | Apr 7, 2025 | Image Segmentation, Semantic Segmentation | Code Available | 2 |
| Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation | Apr 4, 2025 | Domain Generalization, Mamba | Code Available | 2 |
| Delineate Anything: Resolution-Agnostic Field Boundary Delineation on Satellite Imagery | Apr 3, 2025 | Field Boundary Delineation, Instance Segmentation | Code Available | 2 |
| Scene-Centric Unsupervised Panoptic Segmentation | Apr 2, 2025 | Instance Segmentation, Panoptic Segmentation | Code Available | 2 |
| A Unified Image-Dense Annotation Generation Model for Underwater Scenes | Mar 27, 2025 | Depth Estimation, Prediction | Code Available | 2 |
| Towards Generating Realistic 3D Semantic Training Data for Autonomous Driving | Mar 27, 2025 | 3D Semantic Segmentation, Autonomous Driving | Code Available | 2 |
| Exploring CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation | Mar 26, 2025 | Attribute, Semantic Segmentation | Code Available | 2 |
| COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting | Mar 25, 2025 | 3DGS, Object | Code Available | 2 |
| MaSS13K: A Matting-level Semantic Segmentation Benchmark | Mar 24, 2025 | 4k, Image Matting | Code Available | 2 |
| DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation | Mar 24, 2025 | 3D Semantic Segmentation, LIDAR Semantic Segmentation | Code Available | 2 |