| Image Segmentation Keras : Implementation of Segnet, FCN, UNet, PSPNet and other models in Keras | Jul 25, 2023 | Image SegmentationSegmentation | CodeCode Available | 4 |
| Semantic-SAM: Segment and Recognize Anything at Any Granularity | Jul 10, 2023 | Image SegmentationSegmentation | CodeCode Available | 4 |
| The Segment Anything Model (SAM) for Remote Sensing Applications: From Zero to One Shot | Jun 29, 2023 | Image SegmentationSemantic Segmentation | CodeCode Available | 4 |
| SSL4EO-L: Datasets and Foundation Models for Landsat Imagery | Jun 15, 2023 | Cloud DetectionEarth Observation | CodeCode Available | 4 |
| Segment Anything in Medical Images | Apr 24, 2023 | DiagnosticImage Segmentation | CodeCode Available | 4 |
| SegGPT: Segmenting Everything In Context | Apr 6, 2023 | Few-Shot Semantic SegmentationIn-Context Learning | CodeCode Available | 4 |
| InceptionNeXt: When Inception Meets ConvNeXt | Mar 29, 2023 | Image ClassificationSemantic Segmentation | CodeCode Available | 4 |
| RTMDet: An Empirical Study of Designing Real-Time Object Detectors | Dec 14, 2022 | GPUInstance Segmentation | CodeCode Available | 4 |
| Images Speak in Images: A Generalist Painter for In-Context Visual Learning | Dec 5, 2022 | In-Context LearningKeypoint Detection | CodeCode Available | 4 |
| InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions | Nov 10, 2022 | 2D Object DetectionClassification | CodeCode Available | 4 |
| SiamMask: A Framework for Fast Online Object Tracking and Segmentation | Jul 5, 2022 | Multiple Object TrackingObject | CodeCode Available | 4 |
| GLIPv2: Unifying Localization and Vision-Language Understanding | Jun 12, 2022 | 2D Object DetectionContrastive Learning | CodeCode Available | 4 |
| Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | Jun 6, 2022 | Image SegmentationInstance Segmentation | CodeCode Available | 4 |
| EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction | May 29, 2022 | Autonomous DrivingCPU | CodeCode Available | 4 |
| Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN | May 27, 2022 | Image ClassificationInstance Segmentation | CodeCode Available | 4 |
| Highly Accurate Dichotomous Image Segmentation | Mar 6, 2022 | 2k3D Reconstruction | CodeCode Available | 4 |
| Visual Attention Network | Feb 20, 2022 | image-classificationImage Classification | CodeCode Available | 4 |
| Detectron2 Object Detection & Manipulating Images using Cartoonization | Aug 1, 2021 | Autonomous VehiclesData Visualization | CodeCode Available | 4 |
| Panoptic Feature Pyramid Networks | Jan 8, 2019 | Instance SegmentationPanoptic Segmentation | CodeCode Available | 4 |
| Deep Residual Learning for Image Recognition | Dec 10, 2015 | Classification | CodeCode Available | 4 |
| No time to train! Training-Free Reference-Based Instance Segmentation | Jul 3, 2025 | Cross-Domain Few-Shot Object DetectionFew-Shot Object Detection | CodeCode Available | 3 |
| DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation | Apr 7, 2025 | 3D geometryRGBD Semantic Segmentation | CodeCode Available | 3 |
| UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface | Mar 3, 2025 | Instance SegmentationReasoning Segmentation | CodeCode Available | 3 |
| DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks | Feb 24, 2025 | Conditional Image GenerationImage Generation | CodeCode Available | 3 |
| ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features | Feb 6, 2025 | Image SegmentationSegmentation | CodeCode Available | 3 |
| Advances in Multimodal Adaptation and Generalization: From Traditional Approaches to Foundation Models | Jan 30, 2025 | Action RecognitionDomain Adaptation | CodeCode Available | 3 |
| How Well Do Supervised 3D Models Transfer to Medical Imaging Tasks? | Jan 20, 2025 | Computed Tomography (CT)GPU | CodeCode Available | 3 |
| SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation | Dec 16, 2024 | DecoderSemantic Segmentation | CodeCode Available | 3 |
| SVGDreamer++: Advancing Editability and Diversity in Text-Guided SVG Generation | Nov 26, 2024 | DiversityImage Segmentation | CodeCode Available | 3 |
| SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation | Nov 26, 2024 | Natural Language UnderstandingReferring Video Object Segmentation | CodeCode Available | 3 |
| Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline | Nov 19, 2024 | Image SegmentationInteractive Segmentation | CodeCode Available | 3 |
| SMITE: Segment Me In TimE | Oct 24, 2024 | SegmentationSemantic Segmentation | CodeCode Available | 3 |
| UniMatch V2: Pushing the Limit of Semi-Supervised Semantic Segmentation | Oct 14, 2024 | Semantic SegmentationSemi-supervised Change Detection | CodeCode Available | 3 |
| Rethinking the Evaluation of Visible and Infrared Image Fusion | Oct 9, 2024 | object-detectionObject Detection | CodeCode Available | 3 |
| SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images | Oct 2, 2024 | Open Vocabulary Semantic SegmentationOpen-Vocabulary Semantic Segmentation | CodeCode Available | 3 |
| Breaking reCAPTCHAv2 | Sep 13, 2024 | Image SegmentationSemantic Segmentation | CodeCode Available | 3 |
| InstanSeg: an embedding-based instance segmentation algorithm optimized for accurate, efficient and portable cell segmentation | Aug 28, 2024 | Cell SegmentationGPU | CodeCode Available | 3 |
| A Survey of Camouflaged Object Detection and Beyond | Aug 26, 2024 | Instance SegmentationObject | CodeCode Available | 3 |
| A Short Review and Evaluation of SAM2's Performance in 3D CT Image Segmentation | Aug 20, 2024 | Image SegmentationMedical Image Segmentation | CodeCode Available | 3 |
| SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation | Aug 16, 2024 | Image SegmentationMarine Animal Segmentation | CodeCode Available | 3 |
| 5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks | Aug 15, 2024 | image-classificationImage Classification | CodeCode Available | 3 |
| TCFormer: Visual Recognition via Token Clustering Transformer | Jul 16, 2024 | Clusteringimage-classification | CodeCode Available | 3 |
| VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | DecoderObject | CodeCode Available | 3 |
| xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart | Jul 1, 2024 | 3D Medical Imaging Segmentationimage-classification | CodeCode Available | 3 |
| Segment Anything without Supervision | Jun 28, 2024 | ClusteringImage Segmentation | CodeCode Available | 3 |
| Point-SAM: Promptable 3D Segmentation Model for Point Clouds | Jun 25, 2024 | Image SegmentationSegmentation | CodeCode Available | 3 |
| RobustSAM: Segment Anything Robustly on Degraded Images | Jun 13, 2024 | DeblurringImage Dehazing | CodeCode Available | 3 |
| Merlin: A Vision Language Foundation Model for 3D Computed Tomography | Jun 10, 2024 | 3D Semantic SegmentationComputed Tomography (CT) | CodeCode Available | 3 |
| VISTA3D: Versatile Imaging SegmenTation and Annotation model for 3D Computed Tomography | Jun 7, 2024 | Computed Tomography (CT)Image Segmentation | CodeCode Available | 3 |
| Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation | Jun 4, 2024 | 2D Object Detection3D Instance Segmentation | CodeCode Available | 3 |