| No time to train! Training-Free Reference-Based Instance Segmentation | Jul 3, 2025 | Cross-Domain Few-Shot Object DetectionFew-Shot Object Detection | CodeCode Available | 3 |
| nnInteractive: Redefining 3D Promptable Segmentation | Mar 11, 2025 | BenchmarkingInteractive Segmentation | CodeCode Available | 3 |
| UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface | Mar 3, 2025 | Instance SegmentationReasoning Segmentation | CodeCode Available | 3 |
| SCSegamba: Lightweight Structure-Aware Vision Mamba for Crack Segmentation in Structures | Mar 3, 2025 | Crack SegmentationMamba | CodeCode Available | 3 |
| ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features | Feb 6, 2025 | Image SegmentationSegmentation | CodeCode Available | 3 |
| VISTA3D: A Unified Segmentation Foundation Model For 3D Medical Imaging | Jan 1, 2025 | Interactive SegmentationSegmentation | CodeCode Available | 3 |
| Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle | Dec 2, 2024 | Human Instance SegmentationPose-Based Human Instance Segmentation | CodeCode Available | 3 |
| Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline | Nov 19, 2024 | Image SegmentationInteractive Segmentation | CodeCode Available | 3 |
| ZIM: Zero-Shot Image Matting for Anything | Nov 1, 2024 | Image InpaintingImage Matting | CodeCode Available | 3 |
| SMITE: Segment Me In TimE | Oct 24, 2024 | SegmentationSemantic Segmentation | CodeCode Available | 3 |
| Rethinking the Evaluation of Visible and Infrared Image Fusion | Oct 9, 2024 | object-detectionObject Detection | CodeCode Available | 3 |
| SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners | Aug 29, 2024 | Segmentation | CodeCode Available | 3 |
| InstanSeg: an embedding-based instance segmentation algorithm optimized for accurate, efficient and portable cell segmentation | Aug 28, 2024 | Cell SegmentationGPU | CodeCode Available | 3 |
| SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation | Aug 16, 2024 | Image SegmentationMarine Animal Segmentation | CodeCode Available | 3 |
| 5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks | Aug 15, 2024 | image-classificationImage Classification | CodeCode Available | 3 |
| Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2 | Aug 3, 2024 | DiversitySegmentation | CodeCode Available | 3 |
| EAFormer: Scene Text Segmentation with Edge-Aware Transformers | Jul 24, 2024 | DecoderSegmentation | CodeCode Available | 3 |
| VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | DecoderObject | CodeCode Available | 3 |
| xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart | Jul 1, 2024 | 3D Medical Imaging Segmentationimage-classification | CodeCode Available | 3 |
| EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model | Jun 28, 2024 | Interactive SegmentationLanguage Modeling | CodeCode Available | 3 |
| Segment Anything without Supervision | Jun 28, 2024 | ClusteringImage Segmentation | CodeCode Available | 3 |
| Point-SAM: Promptable 3D Segmentation Model for Point Clouds | Jun 25, 2024 | Image SegmentationSegmentation | CodeCode Available | 3 |
| RobustSAM: Segment Anything Robustly on Degraded Images | Jun 13, 2024 | DeblurringImage Dehazing | CodeCode Available | 3 |
| VISTA3D: Versatile Imaging SegmenTation and Annotation model for 3D Computed Tomography | Jun 7, 2024 | Computed Tomography (CT)Image Segmentation | CodeCode Available | 3 |
| GroundGrid:LiDAR Point Cloud Ground Segmentation and Terrain Estimation | May 24, 2024 | Autonomous VehiclesSegmentation | CodeCode Available | 3 |
| CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation | May 17, 2024 | DecoderMamba | CodeCode Available | 3 |
| EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation | May 11, 2024 | Computational EfficiencyDecoder | CodeCode Available | 3 |
| Moving Object Segmentation: All You Need Is SAM (and Flow) | Apr 18, 2024 | AllMotion Segmentation | CodeCode Available | 3 |
| How to build the best medical image segmentation algorithm using foundation models: a comprehensive empirical study with Segment Anything Model | Apr 15, 2024 | DecoderImage Segmentation | CodeCode Available | 3 |
| SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation | Apr 15, 2024 | Brain Tumor SegmentationDecoder | CodeCode Available | 3 |
| Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation | Apr 5, 2024 | DecoderMamba | CodeCode Available | 3 |
| UltraLight VM-UNet: Parallel Vision Mamba Significantly Reduces Parameters for Skin Lesion Segmentation | Mar 29, 2024 | Image SegmentationLesion Segmentation | CodeCode Available | 3 |
| PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | Mar 21, 2024 | DecoderGeneralized Referring Expression Segmentation | CodeCode Available | 3 |
| MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Mar 20, 2024 | Aerial Scene ClassificationBuilding change detection for remote sensing images | CodeCode Available | 3 |
| SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM | Feb 5, 2024 | 3D Semantic SegmentationCamera Pose Estimation | CodeCode Available | 3 |
| Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation | Jan 31, 2024 | Hierarchical Text Segmentationparameter-efficient fine-tuning | CodeCode Available | 3 |
| pix2gestalt: Amodal Segmentation by Synthesizing Wholes | Jan 25, 2024 | 3D ReconstructionObject Recognition | CodeCode Available | 3 |
| RAP-SAM: Towards Real-Time All-Purpose Segment Anything | Jan 18, 2024 | AllDecoder | CodeCode Available | 3 |
| Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation | Jan 1, 2024 | SegmentationSemantic Segmentation | CodeCode Available | 3 |
| Towards Automatic Power Battery Detection: New Challenge Benchmark Dataset and Baseline | Jan 1, 2024 | Crowd Countingobject-detection | CodeCode Available | 3 |
| UniGS: Unified Representation for Image Generation and Segmentation | Dec 4, 2023 | Image GenerationSegmentation | CodeCode Available | 3 |
| SA-Med2D-20M Dataset: Segment Anything in 2D Medical Imaging with 20 Million masks | Nov 20, 2023 | DiversityImage Segmentation | CodeCode Available | 3 |
| Rethinking Evaluation Metrics of Open-Vocabulary Segmentaion | Nov 6, 2023 | Segmentation | CodeCode Available | 3 |
| Putting the Object Back into Video Object Segmentation | Oct 19, 2023 | ObjectSegmentation | CodeCode Available | 3 |
| Tracking Anything with Decoupled Video Segmentation | Sep 7, 2023 | Open-Vocabulary Video SegmentationOpen-World Video Segmentation | CodeCode Available | 3 |
| SAM-Med2D | Aug 30, 2023 | DecoderImage Segmentation | CodeCode Available | 3 |
| Emergence of Segmentation with Minimalistic White-Box Transformers | Aug 30, 2023 | SegmentationSelf-Supervised Learning | CodeCode Available | 3 |
| VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation | Aug 28, 2023 | Instance SegmentationOptical Flow Estimation | CodeCode Available | 3 |
| Segment Anything Meets Point Tracking | Jul 3, 2023 | Interactive Video Object SegmentationObject | CodeCode Available | 3 |
| SAM3D: Segment Anything in 3D Scenes | Jun 6, 2023 | Segmentation | CodeCode Available | 3 |