| Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images | Mar 21, 2025 | Image SegmentationMamba | CodeCode Available | 2 | 5 |
| Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors | Mar 24, 2022 | Image GenerationSemantic Segmentation | CodeCode Available | 2 | 5 |
| MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model | Apr 19, 2024 | ObjectSemantic Segmentation | CodeCode Available | 2 | 5 |
| Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes | Jul 16, 2024 | Human Instance SegmentationInstance Segmentation | CodeCode Available | 2 | 5 |
| Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting | Apr 7, 2025 | Boundary DetectionObject | CodeCode Available | 2 | 5 |
| Mask2Former for Video Instance Segmentation | Dec 20, 2021 | Image SegmentationInstance Segmentation | CodeCode Available | 2 | 5 |
| Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation | Dec 5, 2024 | Image SegmentationOpen Vocabulary Semantic Segmentation | CodeCode Available | 2 | 5 |
| DAMamba: Vision State Space Model with Dynamic Adaptive Scan | Feb 18, 2025 | image-classificationImage Classification | CodeCode Available | 2 | 5 |
| Mask-Free Video Instance Segmentation | Mar 28, 2023 | Instance SegmentationOptical Flow Estimation | CodeCode Available | 2 | 5 |
| MaskTerial: A Foundation Model for Automated 2D Material Flake Detection | Dec 12, 2024 | Instance SegmentationSemantic Segmentation | CodeCode Available | 2 | 5 |
| Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching | May 22, 2023 | AllFew-Shot Semantic Segmentation | CodeCode Available | 2 | 5 |
| CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale Attention | Mar 13, 2023 | image-classificationImage Classification | CodeCode Available | 2 | 5 |
| Caltech Aerial RGB-Thermal Dataset in the Wild | Mar 13, 2024 | SegmentationSemantic Segmentation | CodeCode Available | 2 | 5 |
| 2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds | Jul 10, 2022 | 3D Semantic SegmentationAutonomous Driving | CodeCode Available | 2 | 5 |
| Cross-Image Relational Knowledge Distillation for Semantic Segmentation | Apr 14, 2022 | Knowledge DistillationSegmentation | CodeCode Available | 2 | 5 |
| Medical Image Segmentation with Domain Adaptation: A Survey | Nov 3, 2023 | Domain AdaptationImage Segmentation | CodeCode Available | 2 | 5 |
| CARLA2Real: a tool for reducing the sim2real gap in CARLA simulator | Oct 23, 2024 | Autonomous DrivingSelf-Driving Cars | CodeCode Available | 2 | 5 |
| MedLSAM: Localize and Segment Anything Model for 3D CT Images | Jun 26, 2023 | Image SegmentationMedical Image Analysis | CodeCode Available | 2 | 5 |
| CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation | Nov 15, 2024 | Open Vocabulary Semantic SegmentationOpen-Vocabulary Semantic Segmentation | CodeCode Available | 2 | 5 |
| 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks | Apr 18, 2019 | 3D Semantic Segmentation4D Spatio Temporal Semantic Segmentation | CodeCode Available | 2 | 5 |
| StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation | Aug 2, 2024 | SegmentationSemantic Segmentation | CodeCode Available | 2 | 5 |
| Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive | Jan 16, 2024 | Domain GeneralizationImage Generation | CodeCode Available | 2 | 5 |
| ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding | Oct 17, 2024 | 3D Semantic SegmentationImage Generation | CodeCode Available | 2 | 5 |
| MetaFormer Is Actually What You Need for Vision | Nov 22, 2021 | Image ClassificationObject Detection | CodeCode Available | 2 | 5 |
| CrossEarth: Geospatial Vision Foundation Model for Domain Generalizable Remote Sensing Semantic Segmentation | Oct 30, 2024 | Domain AdaptationDomain Generalization | CodeCode Available | 2 | 5 |