| SegVol: Universal and Interactive Volumetric Medical Image Segmentation | Nov 22, 2023 | Computed Tomography (CT), Image Segmentation | Code Available | 2 |
| Open-Vocabulary Camouflaged Object Segmentation | Nov 19, 2023 | Camouflaged Object Segmentation, Image Segmentation | Code Available | 2 |
| SpectralGPT: Spectral Remote Sensing Foundation Model | Nov 13, 2023 | Change Detection, model | Code Available | 2 |
| GLaMM: Pixel Grounding Large Multimodal Model | Nov 6, 2023 | Conversational Question Answering, Image Captioning | Code Available | 2 |
| Medical Image Segmentation with Domain Adaptation: A Survey | Nov 3, 2023 | Domain Adaptation, Image Segmentation | Code Available | 2 |
| EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision | Nov 3, 2023 | Optical Flow Estimation, Semantic Segmentation | Code Available | 2 |
| TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition | Oct 30, 2023 | Image Classification, Object Detection | Code Available | 2 |
| SAM-Med3D: Towards General-purpose Segmentation Models for Volumetric Medical Images | Oct 23, 2023 | 3D Architecture, Image Segmentation | Code Available | 2 |
| IDRNet: Intervention-Driven Relation Network for Semantic Segmentation | Oct 16, 2023 | Relation, Relation Network | Code Available | 2 |
| UniPAD: A Universal Pre-training Paradigm for Autonomous Driving | Oct 12, 2023 | 3D Object Detection, 3D Semantic Segmentation | Code Available | 2 |
| PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm | Oct 12, 2023 | 3D Object Detection, 3D Reconstruction | Code Available | 2 |
| CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction | Oct 2, 2023 | image-classification, Image Classification | Code Available | 2 |
| nnSAM: Plug-and-play Segment Anything Model Improves nnUNet Performance | Sep 29, 2023 | Few-Shot Learning, Heart Segmentation | Code Available | 2 |
| EPTQ: Enhanced Post-Training Quantization via Hessian-guided Network-wise Optimization | Sep 20, 2023 | Knowledge Distillation, object-detection | Code Available | 2 |
| RMT: Retentive Networks Meet Vision Transformers | Sep 20, 2023 | Instance Segmentation, object-detection | Code Available | 2 |
| DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation | Sep 18, 2023 | 3D geometry, Decoder | Code Available | 2 |
| Beyond Adapting SAM: Towards End-to-End Ultrasound Image Segmentation via Auto Prompting | Sep 13, 2023 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase | Sep 11, 2023 | 3D Semantic Segmentation, LIDAR Semantic Segmentation | Code Available | 2 |
| DAT++: Spatially Dynamic Vision Transformer with Deformable Attention | Sep 4, 2023 | Image Classification, Instance Segmentation | Code Available | 2 |
| RevColV2: Exploring Disentangled Representations in Masked Image Modeling | Sep 2, 2023 | Decoder, image-classification | Code Available | 2 |
| OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation | Sep 1, 2023 | 3D Open-Vocabulary Instance Segmentation, 3D Open-Vocabulary Object Detection | Code Available | 2 |
| Beyond Self-Attention: Deformable Large Kernel Attention for Medical Image Segmentation | Aug 31, 2023 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion | Aug 23, 2023 | Segmentation, Semantic Segmentation | Code Available | 2 |
| Dataset Quantization | Aug 21, 2023 | Dataset Distillation, object-detection | Code Available | 2 |
| MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions | Aug 16, 2023 | Motion Expressions Guided Video Segmentation, Object | Code Available | 2 |
| Tiny and Efficient Model for the Edge Detection Generalization | Aug 12, 2023 | Boundary Detection, Contour Detection | Code Available | 2 |
| DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models | Aug 11, 2023 | Dataset Generation, Decoder | Code Available | 2 |
| Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP | Aug 4, 2023 | Open Vocabulary Panoptic Segmentation, Open Vocabulary Semantic Segmentation | Code Available | 2 |
| XMem++: Production-level Video Segmentation From Few Annotated Frames | Jul 29, 2023 | Segmentation, Semantic Segmentation | Code Available | 2 |
| Tracking Anything in High Quality | Jul 26, 2023 | Object, Object Tracking | Code Available | 2 |
| CNOS: A Strong Baseline for CAD-based Novel Object Segmentation | Jul 20, 2023 | Object, Semantic Segmentation | Code Available | 2 |
| Scale-Aware Modulation Meet Transformer | Jul 17, 2023 | object-detection, Object Detection | Code Available | 2 |
| EGE-UNet: an Efficient Group Enhanced UNet for skin lesion segmentation | Jul 17, 2023 | Decoder, Image Segmentation | Code Available | 2 |
| Hierarchical Open-vocabulary Universal Image Segmentation | Jul 3, 2023 | Image Comprehension, Image Segmentation | Code Available | 2 |
| MIS-FM: 3D Medical Image Segmentation using Foundation Models Pretrained on a Large-Scale Unannotated Dataset | Jun 29, 2023 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model | Jun 28, 2023 | Image Segmentation, Instance Segmentation | Code Available | 2 |
| CellViT: Vision Transformers for Precise Cell Segmentation and Classification | Jun 27, 2023 | Cell Detection, Cell Segmentation | Code Available | 2 |
| MedLSAM: Localize and Segment Anything Model for 3D CT Images | Jun 26, 2023 | Image Segmentation, Medical Image Analysis | Code Available | 2 |
| 3DSAM-adapter: Holistic adaptation of SAM from 2D to 3D for promptable tumor segmentation | Jun 23, 2023 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| OpenMask3D: Open-Vocabulary 3D Instance Segmentation | Jun 23, 2023 | 3D Instance Segmentation, 3D Open-Vocabulary Instance Segmentation | Code Available | 2 |
| Efficient 3D Semantic Segmentation with Superpoint Transformer | Jun 13, 2023 | 3D Semantic Segmentation, GPU | Code Available | 2 |
| SegViTv2: Exploring Efficient and Continual Semantic Segmentation with Plain Vision Transformers | Jun 9, 2023 | Continual Learning, Continual Semantic Segmentation | Code Available | 2 |
| Does Image Anonymization Impact Computer Vision Training? | Jun 8, 2023 | Face Anonymization, Instance Segmentation | Code Available | 2 |
| Using Unreliable Pseudo-Labels for Label-Efficient Semantic Segmentation | Jun 4, 2023 | Semantic Segmentation | Code Available | 2 |
| SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model | Jun 4, 2023 | 3D Object Detection, Image Segmentation | Code Available | 2 |
| Contextual Object Detection with Multimodal Large Language Models | May 29, 2023 | Cloze Test, Decoder | Code Available | 2 |
| SSSegmenation: An Open Source Supervised Semantic Segmentation Toolbox Based on PyTorch | May 26, 2023 | Image Segmentation, Segmentation | Code Available | 2 |
| A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence | May 24, 2023 | Dense Pixel Correspondence Estimation, Representation Learning | Code Available | 2 |
| SAD: Segment Any RGBD | May 23, 2023 | 3D Panoptic Segmentation, Open Vocabulary Semantic Segmentation | Code Available | 2 |
| Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching | May 22, 2023 | All, Few-Shot Semantic Segmentation | Code Available | 2 |