| Personalize Segment Anything Model with One Shot | May 4, 2023 | Image Generation; model | Code Available | 3 | 5 |
| RS-Mamba for Large Remote Sensing Image Dense Prediction | Apr 3, 2024 | Building change detection for remote sensing images; Change Detection | Code Available | 3 | 5 |
| PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies | Jun 9, 2022 | 3D Classification; 3D Part Segmentation | Code Available | 3 | 5 |
| SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation | Aug 16, 2024 | Image Segmentation; Marine Animal Segmentation | Code Available | 3 | 5 |
| Advances in Multimodal Adaptation and Generalization: From Traditional Approaches to Foundation Models | Jan 30, 2025 | Action Recognition; Domain Adaptation | Code Available | 3 | 5 |
| SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More | Apr 18, 2023 | General Knowledge; Image Segmentation | Code Available | 3 | 5 |
| PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360° | Jan 1, 2023 | Image Generation; Image Segmentation | Code Available | 3 | 5 |
| SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images | Oct 2, 2024 | Open Vocabulary Semantic Segmentation; Open-Vocabulary Semantic Segmentation | Code Available | 3 | 5 |
| Point-SAM: Promptable 3D Segmentation Model for Point Clouds | Jun 25, 2024 | Image Segmentation; Segmentation | Code Available | 3 | 5 |
| Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation | Jun 4, 2024 | 2D Object Detection; 3D Instance Segmentation | Code Available | 3 | 5 |
| ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | May 18, 2023 | 1 Image, 2*2 Stitchi; Action Classification | Code Available | 3 | 5 |
| Nuclei instance segmentation and classification in histopathology images with StarDist | Mar 3, 2022 | Classification; Instance Segmentation | Code Available | 3 | 5 |
| OneFormer: One Transformer to Rule Universal Image Segmentation | Nov 10, 2022 | Instance Segmentation; Panoptic Segmentation | Code Available | 3 | 5 |
| PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360° | Mar 23, 2023 | Image Generation; Image Segmentation | Code Available | 3 | 5 |
| Point Transformer V3: Simpler, Faster, Stronger | Dec 15, 2023 | 3D Semantic Segmentation; LIDAR Semantic Segmentation | Code Available | 3 | 5 |
| MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Mar 20, 2024 | Aerial Scene Classification; Building change detection for remote sensing images | Code Available | 3 | 5 |
| Breaking reCAPTCHAv2 | Sep 13, 2024 | Image Segmentation; Semantic Segmentation | Code Available | 3 | 5 |
| Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation | Apr 5, 2024 | Decoder; Mamba | Code Available | 3 | 5 |
| Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving | May 8, 2024 | Autonomous Driving; LIDAR Semantic Segmentation | Code Available | 3 | 5 |
| Stronger Fewer & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation | Jan 1, 2024 | Domain Generalization; Semantic Segmentation | Code Available | 3 | 5 |
| Merlin: A Vision Language Foundation Model for 3D Computed Tomography | Jun 10, 2024 | 3D Semantic Segmentation; Computed Tomography (CT) | Code Available | 3 | 5 |
| TCFormer: Visual Recognition via Token Clustering Transformer | Jul 16, 2024 | Clustering; image-classification | Code Available | 3 | 5 |
| Moving Object Segmentation: All You Need Is SAM (and Flow) | Apr 18, 2024 | All; Motion Segmentation | Code Available | 3 | 5 |
| No time to train! Training-Free Reference-Based Instance Segmentation | Jul 3, 2025 | Cross-Domain Few-Shot Object Detection; Few-Shot Object Detection | Code Available | 3 | 5 |
| Transformers in Medical Imaging: A Survey | Jan 24, 2022 | Image Classification; Image Segmentation | Code Available | 3 | 5 |
| UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface | Mar 3, 2025 | Instance Segmentation; Reasoning Segmentation | Code Available | 3 | 5 |
| Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation | Apr 25, 2023 | Image Segmentation; Medical Image Segmentation | Code Available | 3 | 5 |
| MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model | Nov 1, 2022 | Anomaly Detection; Brain Tumor Segmentation | Code Available | 3 | 5 |
| MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer | Jan 19, 2023 | Image Generation; Image Segmentation | Code Available | 3 | 5 |
| PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | Mar 21, 2024 | Decoder; Generalized Referring Expression Segmentation | Code Available | 3 | 5 |
| Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks | Mar 30, 2023 | Human Parsing; Pedestrian Attribute Recognition | Code Available | 3 | 5 |
| Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline | Nov 19, 2024 | Image Segmentation; Interactive Segmentation | Code Available | 3 | 5 |
| InstanSeg: an embedding-based instance segmentation algorithm optimized for accurate, efficient and portable cell segmentation | Aug 28, 2024 | Cell Segmentation; GPU | Code Available | 3 | 5 |
| LangSplat: 3D Language Gaussian Splatting | Dec 26, 2023 | NeRF; Object Localization | Code Available | 3 | 5 |
| How Well Do Supervised 3D Models Transfer to Medical Imaging Tasks? | Jan 20, 2025 | Computed Tomography (CT); GPU | Code Available | 3 | 5 |
| How to build the best medical image segmentation algorithm using foundation models: a comprehensive empirical study with Segment Anything Model | Apr 15, 2024 | Decoder; Image Segmentation | Code Available | 3 | 5 |
| LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation | Mar 8, 2024 | Image Segmentation; Mamba | Code Available | 3 | 5 |
| A Simple Framework for Open-Vocabulary Segmentation and Detection | Mar 14, 2023 | Instance Segmentation; Panoptic Segmentation | Code Available | 3 | 5 |
| A Short Review and Evaluation of SAM2's Performance in 3D CT Image Segmentation | Aug 20, 2024 | Image Segmentation; Medical Image Segmentation | Code Available | 3 | 5 |
| FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization | Mar 24, 2023 | 3D Hand Pose Estimation; GPU | Code Available | 3 | 5 |
| FDA: Fourier Domain Adaptation for Semantic Segmentation | Apr 11, 2020 | Domain Adaptation; Segmentation | Code Available | 3 | 5 |
| FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes | May 7, 2024 | 3D Point Cloud Classification; 3D Semantic Segmentation | Code Available | 3 | 5 |
| Anything-3D: Towards Single-view Anything Reconstruction in the Wild | Apr 19, 2023 | 3D Reconstruction; Diversity | Code Available | 3 | 5 |
| Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation | Jan 1, 2024 | Segmentation; Semantic Segmentation | Code Available | 3 | 5 |
| EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation | May 11, 2024 | Computational Efficiency; Decoder | Code Available | 3 | 5 |
| Generalized Decoding for Pixel, Image, and Language | Dec 21, 2022 | Decoder; Image Segmentation | Code Available | 3 | 5 |
| AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One | Dec 10, 2023 | All; Benchmarking | Code Available | 3 | 5 |
| DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation | Apr 7, 2025 | 3D geometry; RGBD Semantic Segmentation | Code Available | 3 | 5 |
| DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks | Feb 24, 2025 | Conditional Image Generation; Image Generation | Code Available | 3 | 5 |
| A Survey of Camouflaged Object Detection and Beyond | Aug 26, 2024 | Instance Segmentation; Object | Code Available | 3 | 5 |