| Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images | Mar 21, 2025 | Image Segmentation, Mamba | Code Available | 2 |
| Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting | Mar 18, 2025 | Instance Segmentation, Object | Code Available | 2 |
| Test-Time Domain Generalization via Universe Learning: A Multi-Graph Matching Approach for Medical Image Segmentation | Mar 17, 2025 | Domain Adaptation, Domain Generalization | Code Available | 2 |
| HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model | Mar 17, 2025 | Image Segmentation, Segmentation | Code Available | 2 |
| RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing | Mar 13, 2025 | Computational Efficiency, Mamba | Code Available | 2 |
| DiffAtlas: GenAI-fying Atlas Segmentation via Image-Mask Diffusion | Mar 9, 2025 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | Mar 5, 2025 | Object, Referring Video Object Segmentation | Code Available | 2 |
| Golden Cudgel Network for Real-Time Semantic Segmentation | Mar 5, 2025 | Real-Time Semantic Segmentation, Semantic Segmentation | Code Available | 2 |
| SemiSAM+: Rethinking Semi-Supervised Medical Image Segmentation in the Era of Foundation Models | Feb 28, 2025 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| DAMamba: Vision State Space Model with Dynamic Adaptive Scan | Feb 18, 2025 | image-classification, Image Classification | Code Available | 2 |
| SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement | Feb 10, 2025 | Semantic Segmentation | Code Available | 2 |
| Segment Anything for Histopathology | Feb 1, 2025 | Image Segmentation, Instance Segmentation | Code Available | 2 |
| iFormer: Integrating ConvNet and Transformer for Mobile Application | Jan 26, 2025 | Instance Segmentation, object-detection | Code Available | 2 |
| LWGANet: A Lightweight Group Attention Backbone for Remote Sensing Visual Tasks | Jan 17, 2025 | Change Detection, Image Classification | Code Available | 2 |
| Scaling up self-supervised learning for improved surgical foundation models | Jan 16, 2025 | Self-Supervised Learning, Semantic Segmentation | Code Available | 2 |
| Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation | Jan 15, 2025 | Image Segmentation, Referring Expression Segmentation | Code Available | 2 |
| RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation | Jan 14, 2025 | Computational Efficiency, Image Segmentation | Code Available | 2 |
| RSRefSeg: Referring Remote Sensing Image Segmentation with Foundation Models | Jan 12, 2025 | Image Segmentation, Segmentation | Code Available | 2 |
| Merging Context Clustering with Visual State Space Models for Medical Image Segmentation | Jan 3, 2025 | Clustering, Image Segmentation | Code Available | 2 |
| nnWNet: Rethinking the Use of Transformers in Biomedical Image Segmentation and Calling for a Unified Evaluation Benchmark | Jan 1, 2025 | Benchmarking, Image Segmentation | Code Available | 2 |
| Towards Open-Vocabulary Remote Sensing Image Semantic Segmentation | Dec 27, 2024 | Image Segmentation, Semantic Segmentation | Code Available | 2 |
| RelationField: Relate Anything in Radiance Fields | Dec 18, 2024 | 3d scene graph generation, Graph Generation | Code Available | 2 |
| Learnable Prompting SAM-induced Knowledge Distillation for Semi-supervised Medical Image Segmentation | Dec 18, 2024 | Image Segmentation, Knowledge Distillation | Code Available | 2 |
| MaskTerial: A Foundation Model for Automated 2D Material Flake Detection | Dec 12, 2024 | Instance Segmentation, Semantic Segmentation | Code Available | 2 |
| FAMNet: Frequency-aware Matching Network for Cross-domain Few-shot Medical Image Segmentation | Dec 12, 2024 | Cross-Domain Few-Shot, Domain Generalization | Code Available | 2 |
| ConDSeg: A General Medical Image Segmentation Framework via Contrast-Driven Feature Enhancement | Dec 11, 2024 | Decoder, Image Segmentation | Code Available | 2 |
| SegFace: Face Segmentation of Long-Tail Classes | Dec 11, 2024 | Face Parsing, Face Swapping | Code Available | 2 |
| DreamColour: Controllable Video Colour Editing without Training | Dec 6, 2024 | Instance Segmentation, Semantic Segmentation | Code Available | 2 |
| Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation | Dec 5, 2024 | Semantic Segmentation, Time Series | Code Available | 2 |
| Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation | Dec 5, 2024 | Image Segmentation, Open Vocabulary Semantic Segmentation | Code Available | 2 |
| SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning | Dec 5, 2024 | Domain Adaptation, Domain Generalization | Code Available | 2 |
| FLAIR: VLM with Fine-grained Language-informed Image Representations | Dec 4, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| 2DMamba: Efficient State Space Model for Image Representation with Applications on Giga-Pixel Whole Slide Image Classification | Dec 1, 2024 | Computational Efficiency, image-classification | Code Available | 2 |
| TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba | Nov 26, 2024 | image-classification, Image Classification | Code Available | 2 |
| vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation | Nov 26, 2024 | Image Segmentation, Medical Image Analysis | Code Available | 2 |
| HyperSeg: Towards Universal Visual Segmentation with Large Language Model | Nov 26, 2024 | Language Modeling, Large Language Model | Code Available | 2 |
| An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models | Nov 25, 2024 | Denoising, Scene Understanding | Code Available | 2 |
| Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training | Nov 25, 2024 | object-detection, Object Detection | Code Available | 2 |
| Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation | Nov 24, 2024 | Semantic Segmentation | Code Available | 2 |
| ResCLIP: Residual Attention for Training-free Dense Vision-language Inference | Nov 24, 2024 | Attribute, Semantic Segmentation | Code Available | 2 |
| IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos | Nov 18, 2024 | Pose Estimation, Semantic Segmentation | Code Available | 2 |
| CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation | Nov 15, 2024 | Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation | Code Available | 2 |
| Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation | Nov 14, 2024 | Segmentation, Semantic Segmentation | Code Available | 2 |
| CrossEarth: Geospatial Vision Foundation Model for Domain Generalizable Remote Sensing Semantic Segmentation | Oct 30, 2024 | Domain Adaptation, Domain Generalization | Code Available | 2 |
| Multimodality Helps Few-Shot 3D Point Cloud Semantic Segmentation | Oct 29, 2024 | Few-shot 3D Point Cloud Semantic Segmentation, Point Cloud Segmentation | Code Available | 2 |
| Domain Adaptation with a Single Vision-Language Embedding | Oct 28, 2024 | Domain Adaptation, One-shot Unsupervised Domain Adaptation | Code Available | 2 |
| Moving Object Segmentation in Point Cloud Data using Hidden Markov Models | Oct 24, 2024 | Semantic Segmentation | Code Available | 2 |
| CARLA2Real: a tool for reducing the sim2real gap in CARLA simulator | Oct 23, 2024 | Autonomous Driving, Self-Driving Cars | Code Available | 2 |
| DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model | Oct 22, 2024 | Decoder, Instance Segmentation | Code Available | 2 |
| TIPS: Text-Image Pretraining with Spatial Awareness | Oct 21, 2024 | Depth Estimation, Image Captioning | Code Available | 2 |