| DiffAtlas: GenAI-fying Atlas Segmentation via Image-Mask Diffusion | Mar 9, 2025 | Image Segmentation, Medical Image Segmentation | Code Available | 2 | 5 |
| DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models | Aug 11, 2023 | Dataset Generation, Decoder | Code Available | 2 | 5 |
| Mask2Former for Video Instance Segmentation | Dec 20, 2021 | Image Segmentation, Instance Segmentation | Code Available | 2 | 5 |
| Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting | Apr 7, 2025 | Boundary Detection, Object | Code Available | 2 | 5 |
| CARLA2Real: a tool for reducing the sim2real gap in CARLA simulator | Oct 23, 2024 | Autonomous Driving, Self-Driving Cars | Code Available | 2 | 5 |
| Masked Generative Distillation | May 3, 2022 | image-classification, Image Classification | Code Available | 2 | 5 |
| Dataset Quantization | Aug 21, 2023 | Dataset Distillation, object-detection | Code Available | 2 | 5 |
| MaskTerial: A Foundation Model for Automated 2D Material Flake Detection | Dec 12, 2024 | Instance Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation | Jun 2, 2024 | Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| DAMamba: Vision State Space Model with Dynamic Adaptive Scan | Feb 18, 2025 | image-classification, Image Classification | Code Available | 2 | 5 |
| MCIBI++: Soft Mining Contextual Information Beyond Image for Semantic Segmentation | Sep 9, 2022 | Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| Caltech Aerial RGB-Thermal Dataset in the Wild | Mar 13, 2024 | Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| DAT++: Spatially Dynamic Vision Transformer with Deformable Attention | Sep 4, 2023 | Image Classification, Instance Segmentation | Code Available | 2 | 5 |
| Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images | Mar 21, 2025 | Image Segmentation, Mamba | Code Available | 2 | 5 |
| CAT-SAM: Conditional Tuning for Few-Shot Adaptation of Segment Anything Model | Feb 6, 2024 | Decoder, Image Segmentation | Code Available | 2 | 5 |
| CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation | Mar 21, 2023 | Image Segmentation, Open Vocabulary Semantic Segmentation | Code Available | 2 | 5 |
| 2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds | Jul 10, 2022 | 3D Semantic Segmentation, Autonomous Driving | Code Available | 2 | 5 |
| Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes | Jul 16, 2024 | Human Instance Segmentation, Instance Segmentation | Code Available | 2 | 5 |
| MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis | Aug 14, 2024 | Anomaly Detection, Boundary Detection | Code Available | 2 | 5 |
| MedUniSeg: 2D and 3D Medical Image Segmentation via a Prompt-driven Universal Model | Oct 8, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 2 | 5 |
| StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation | Aug 2, 2024 | Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive | Jan 16, 2024 | Domain Generalization, Image Generation | Code Available | 2 | 5 |
| MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions | Aug 16, 2023 | Motion Expressions Guided Video Segmentation, Object | Code Available | 2 | 5 |
| Cross-Image Relational Knowledge Distillation for Semantic Segmentation | Apr 14, 2022 | Knowledge Distillation, Segmentation | Code Available | 2 | 5 |
| MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training | Aug 3, 2022 | Instance Segmentation, Segmentation | Code Available | 2 | 5 |
| MIS-FM: 3D Medical Image Segmentation using Foundation Models Pretrained on a Large-Scale Unannotated Dataset | Jun 29, 2023 | Image Segmentation, Medical Image Segmentation | Code Available | 2 | 5 |
| CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion | Nov 26, 2022 | object-detection, Object Detection | Code Available | 2 | 5 |
| MobileUNETR: A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation | Sep 4, 2024 | Image Segmentation, Lesion Segmentation | Code Available | 2 | 5 |
| 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks | Apr 18, 2019 | 3D Semantic Segmentation, 4D Spatio Temporal Semantic Segmentation | Code Available | 2 | 5 |
| ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding | Oct 17, 2024 | 3D Semantic Segmentation, Image Generation | Code Available | 2 | 5 |
| Modeling the Label Distributions for Weakly-Supervised Semantic Segmentation | Mar 20, 2024 | Semantic Segmentation, Weakly supervised Semantic Segmentation | Code Available | 2 | 5 |
| More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity | Jul 7, 2022 | Object Detection, Semantic Segmentation | Code Available | 2 | 5 |
| Cross Language Image Matching for Weakly Supervised Semantic Segmentation | Mar 5, 2022 | Object, Semantic Segmentation | Code Available | 2 | 5 |
| Moving Object Segmentation in Point Cloud Data using Hidden Markov Models | Oct 24, 2024 | Semantic Segmentation | Code Available | 2 | 5 |
| Customized Segment Anything Model for Medical Image Segmentation | Apr 26, 2023 | Decoder, Image Segmentation | Code Available | 2 | 5 |
| MSVM-UNet: Multi-Scale Vision Mamba UNet for Medical Image Segmentation | Aug 25, 2024 | Image Segmentation, Mamba | Code Available | 2 | 5 |
| Multimodal Information Interaction for Medical Image Segmentation | Apr 25, 2024 | Heart Segmentation, Image Segmentation | Code Available | 2 | 5 |
| ASAM: Boosting Segment Anything Model with Adversarial Tuning | May 1, 2024 | Image Segmentation, model | Code Available | 2 | 5 |
| Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery | Mar 18, 2024 | Instance Segmentation, NeRF | Code Available | 2 | 5 |
| Multi-Scale Representations by Varying Window Attention for Semantic Segmentation | Apr 25, 2024 | Decoder, Semantic Segmentation | Code Available | 2 | 5 |
| Narrowing the semantic gaps in U-Net with learnable skip connections: The case of medical image segmentation | Dec 23, 2023 | Decoder, Image Segmentation | Code Available | 2 | 5 |
| Neighborhood Attention Transformer | Apr 14, 2022 | image-classification, Image Classification | Code Available | 2 | 5 |
| SlicerNNInteractive: A 3D Slicer extension for nnInteractive | Apr 7, 2025 | Image Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark Detection with State Space Model | Feb 5, 2024 | 3D Medical Imaging Segmentation, Image Segmentation | Code Available | 2 | 5 |
| DaViT: Dual Attention Vision Transformers | Apr 7, 2022 | Computational Efficiency, Image Classification | Code Available | 2 | 5 |
| BEFUnet: A Hybrid CNN-Transformer Architecture for Precise Medical Image Segmentation | Feb 13, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 2 | 5 |
| Coordinate Attention for Efficient Mobile Network Design | Mar 4, 2021 | object-detection, Object Detection | Code Available | 2 | 5 |
| CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation | Nov 15, 2024 | Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation | Code Available | 2 | 5 |
| Occlusion-Aware Instance Segmentation via BiLayer Network Architectures | Aug 8, 2022 | Human Instance Segmentation, Instance Segmentation | Code Available | 2 | 5 |
| BEiT: BERT Pre-Training of Image Transformers | Jun 15, 2021 | Document Image Classification, Document Layout Analysis | Code Available | 2 | 5 |