| Generalized Decoding for Pixel, Image, and Language | Dec 21, 2022 | Decoder, Image Segmentation | Code Available | 3 |
| OneFormer: One Transformer to Rule Universal Image Segmentation | Nov 10, 2022 | Instance Segmentation, Panoptic Segmentation | Code Available | 3 |
| MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model | Nov 1, 2022 | Anomaly Detection, Brain Tumor Segmentation | Code Available | 3 |
| Vision Transformers: From Semantic Segmentation to Dense Prediction | Jul 19, 2022 | Image Classification | Code Available | 3 |
| XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model | Jul 14, 2022 | 2D Human Pose Estimation, 2D Object Detection | Code Available | 3 |
| PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies | Jun 9, 2022 | 3D Classification, 3D Part Segmentation | Code Available | 3 |
| Vision Transformer Adapter for Dense Predictions | May 17, 2022 | Instance Segmentation, Object Detection | Code Available | 3 |
| UNetFormer: A Unified Vision Transformer Model and Pre-Training Framework for 3D Medical Image Segmentation | Apr 1, 2022 | Brain Tumor Segmentation, Image Segmentation | Code Available | 3 |
| Nuclei instance segmentation and classification in histopathology images with StarDist | Mar 3, 2022 | Classification, Instance Segmentation | Code Available | 3 |
| Transformers in Medical Imaging: A Survey | Jan 24, 2022 | Image Classification, Image Segmentation | Code Available | 3 |
| XCiT: Cross-Covariance Image Transformers | Jun 17, 2021 | Image Classification | Code Available | 3 |
| Vision Transformers for Dense Prediction | Mar 24, 2021 | Decoder, Depth Estimation | Code Available | 3 |
| UNETR: Transformers for 3D Medical Image Segmentation | Mar 18, 2021 | 3D Medical Imaging Segmentation, Decoder | Code Available | 3 |
| MA-Net: A Multi-Scale Attention Network for Liver and Tumor Segmentation | Sep 21, 2020 | Image Segmentation, Medical Image Segmentation | Code Available | 3 |
| ResNeSt: Split-Attention Networks | Apr 19, 2020 | Image Classification | Code Available | 3 |
| FDA: Fourier Domain Adaptation for Semantic Segmentation | Apr 11, 2020 | Domain Adaptation, Segmentation | Code Available | 3 |
| Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform | Apr 9, 2018 | Image Super-Resolution, Semantic Segmentation | Code Available | 3 |
| U-Net: Convolutional Networks for Biomedical Image Segmentation | May 18, 2015 | Cell Segmentation, Cell Tracking | Code Available | 3 |
| Alleviating Textual Reliance in Medical Language-guided Segmentation via Prototype-driven Semantic Approximation | Jul 15, 2025 | Image Segmentation, Segmentation | Code Available | 2 |
| Pre-Trained LLM is a Semantic-Aware and Generalizable Segmentation Booster | Jun 22, 2025 | Decoder, Image Segmentation | Code Available | 2 |
| Urban1960SatSeg: Unsupervised Semantic Segmentation of Mid-20th century Urban Landscapes with Satellite Imageries | Jun 11, 2025 | Segmentation, Self-Supervised Learning | Code Available | 2 |
| Segment This Thing: Foveated Tokenization for Efficient Point-Prompted Segmentation | Jun 10, 2025 | Foveation, Image Segmentation | Code Available | 2 |
| VideoMolmo: Spatio-Temporal Grounding Meets Pointing | Jun 5, 2025 | Autonomous Driving, Autonomous Navigation | Code Available | 2 |
| Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos | Jun 5, 2025 | GPU, Semantic Segmentation | Code Available | 2 |
| Simulate Any Radar: Attribute-Controllable Radar Simulation via Waveform Parameter Embedding | Jun 3, 2025 | 3D Object Detection, Attribute | Code Available | 2 |
| TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models | May 29, 2025 | Referring Expression, Referring Expression Comprehension | Code Available | 2 |
| The Missing Point in Vision Transformers for Universal Image Segmentation | May 26, 2025 | Image Segmentation, Instance Segmentation | Code Available | 2 |
| Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration | May 26, 2025 | Domain Generalization, Hallucination | Code Available | 2 |
| MetaUAS: Universal Anomaly Segmentation with One-Prompt Meta-Learning | May 14, 2025 | Anomaly Detection, Anomaly Segmentation | Code Available | 2 |
| Recent Advances in Medical Imaging Segmentation: A Survey | May 14, 2025 | Domain Adaptation, Few-Shot Learning | Code Available | 2 |
| Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation | May 9, 2025 | Image Generation, Image Segmentation | Code Available | 2 |
| DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception | May 7, 2025 | Object Detection | Code Available | 2 |
| Rethinking Boundary Detection in Deep Learning-Based Medical Image Segmentation | May 6, 2025 | Boundary Detection, Decoder | Code Available | 2 |
| Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook | May 1, 2025 | Benchmarking, Change Detection | Code Available | 2 |
| Digital Twin Generation from Visual Data: A Survey | Apr 17, 2025 | Semantic Segmentation, Survey | Code Available | 2 |
| The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer | Apr 14, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| P2Object: Single Point Supervised Object Detection and Instance Segmentation | Apr 10, 2025 | Instance Segmentation, Multiple Instance Learning | Code Available | 2 |
| GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | Apr 10, 2025 | Contrastive Learning, Language Modeling | Code Available | 2 |
| Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation | Apr 8, 2025 | Domain Adaptation, Domain Generalization | Code Available | 2 |
| Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting | Apr 7, 2025 | Boundary Detection, Object | Code Available | 2 |
| SlicerNNInteractive: A 3D Slicer extension for nnInteractive | Apr 7, 2025 | Image Segmentation, Semantic Segmentation | Code Available | 2 |
| Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation | Apr 4, 2025 | Domain Generalization, Mamba | Code Available | 2 |
| Delineate Anything: Resolution-Agnostic Field Boundary Delineation on Satellite Imagery | Apr 3, 2025 | Field Boundary Delineation, Instance Segmentation | Code Available | 2 |
| Scene-Centric Unsupervised Panoptic Segmentation | Apr 2, 2025 | Instance Segmentation, Panoptic Segmentation | Code Available | 2 |
| A Unified Image-Dense Annotation Generation Model for Underwater Scenes | Mar 27, 2025 | Depth Estimation, Prediction | Code Available | 2 |
| Towards Generating Realistic 3D Semantic Training Data for Autonomous Driving | Mar 27, 2025 | 3D Semantic Segmentation, Autonomous Driving | Code Available | 2 |
| Exploring CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation | Mar 26, 2025 | Attribute, Semantic Segmentation | Code Available | 2 |
| COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting | Mar 25, 2025 | 3DGS, Object | Code Available | 2 |
| MaSS13K: A Matting-level Semantic Segmentation Benchmark | Mar 24, 2025 | 4k, Image Matting | Code Available | 2 |
| DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation | Mar 24, 2025 | 3D Semantic Segmentation, LIDAR Semantic Segmentation | Code Available | 2 |