| CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation | May 17, 2024 | DecoderMamba | CodeCode Available | 3 |
| EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation | May 11, 2024 | Computational EfficiencyDecoder | CodeCode Available | 3 |
| Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving | May 8, 2024 | Autonomous DrivingLIDAR Semantic Segmentation | CodeCode Available | 3 |
| FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes | May 7, 2024 | 3D Point Cloud Classification3D Semantic Segmentation | CodeCode Available | 3 |
| Moving Object Segmentation: All You Need Is SAM (and Flow) | Apr 18, 2024 | AllMotion Segmentation | CodeCode Available | 3 |
| How to build the best medical image segmentation algorithm using foundation models: a comprehensive empirical study with Segment Anything Model | Apr 15, 2024 | DecoderImage Segmentation | CodeCode Available | 3 |
| SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation | Apr 15, 2024 | Brain Tumor SegmentationDecoder | CodeCode Available | 3 |
| Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation | Apr 5, 2024 | DecoderMamba | CodeCode Available | 3 |
| RS-Mamba for Large Remote Sensing Image Dense Prediction | Apr 3, 2024 | Building change detection for remote sensing imagesChange Detection | CodeCode Available | 3 |
| UltraLight VM-UNet: Parallel Vision Mamba Significantly Reduces Parameters for Skin Lesion Segmentation | Mar 29, 2024 | Image SegmentationLesion Segmentation | CodeCode Available | 3 |
| Segment Any Medical Model Extended | Mar 26, 2024 | Data AugmentationImage Segmentation | CodeCode Available | 3 |
| PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Mar 26, 2024 | Image ClassificationInstance Segmentation | CodeCode Available | 3 |
| Segment Anything Model for Road Network Graph Extraction | Mar 24, 2024 | Graph LearningGraph Neural Network | CodeCode Available | 3 |
| PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | Mar 21, 2024 | DecoderGeneralized Referring Expression Segmentation | CodeCode Available | 3 |
| MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Mar 20, 2024 | Aerial Scene ClassificationBuilding change detection for remote sensing images | CodeCode Available | 3 |
| ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions | Mar 13, 2024 | Instance SegmentationObject Detection | CodeCode Available | 3 |
| What Matters When Repurposing Diffusion Models for General Dense Perception Tasks? | Mar 10, 2024 | Depth EstimationImage Matting | CodeCode Available | 3 |
| LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation | Mar 8, 2024 | Image SegmentationMamba | CodeCode Available | 3 |
| Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining | Feb 5, 2024 | Image SegmentationMamba | CodeCode Available | 3 |
| SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM | Feb 5, 2024 | 3D Semantic SegmentationCamera Pose Estimation | CodeCode Available | 3 |
| RAP-SAM: Towards Real-Time All-Purpose Segment Anything | Jan 18, 2024 | AllDecoder | CodeCode Available | 3 |
| Denoising Vision Transformers | Jan 5, 2024 | DenoisingDepth Estimation | CodeCode Available | 3 |
| Stronger Fewer & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation | Jan 1, 2024 | Domain GeneralizationSemantic Segmentation | CodeCode Available | 3 |
| Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation | Jan 1, 2024 | SegmentationSemantic Segmentation | CodeCode Available | 3 |
| LangSplat: 3D Language Gaussian Splatting | Dec 26, 2023 | NeRFObject Localization | CodeCode Available | 3 |
| Point Transformer V3: Simpler, Faster, Stronger | Dec 15, 2023 | 3D Semantic SegmentationLIDAR Semantic Segmentation | CodeCode Available | 3 |
| AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One | Dec 10, 2023 | AllBenchmarking | CodeCode Available | 3 |
| Generalized Robot 3D Vision-Language Model with Fast Rendering and Pre-Training Vision-Language Alignment | Dec 1, 2023 | Contrastive LearningFew-Shot Learning | CodeCode Available | 3 |
| UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition | Nov 27, 2023 | Image ClassificationObject Detection | CodeCode Available | 3 |
| SA-Med2D-20M Dataset: Segment Anything in 2D Medical Imaging with 20 Million masks | Nov 20, 2023 | DiversityImage Segmentation | CodeCode Available | 3 |
| Putting the Object Back into Video Object Segmentation | Oct 19, 2023 | ObjectSegmentation | CodeCode Available | 3 |
| Tracking Anything with Decoupled Video Segmentation | Sep 7, 2023 | Open-Vocabulary Video SegmentationOpen-World Video Segmentation | CodeCode Available | 3 |
| SAM-Med2D | Aug 30, 2023 | DecoderImage Segmentation | CodeCode Available | 3 |
| VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation | Aug 28, 2023 | Instance SegmentationOptical Flow Estimation | CodeCode Available | 3 |
| Quantifying the robustness of deep multispectral segmentation models against natural perturbations and data poisoning | May 18, 2023 | Adversarial RobustnessData Poisoning | CodeCode Available | 3 |
| ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | May 18, 2023 | 1 Image, 2*2 StitchiAction Classification | CodeCode Available | 3 |
| Personalize Segment Anything Model with One Shot | May 4, 2023 | Image Generationmodel | CodeCode Available | 3 |
| Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation | Apr 25, 2023 | Image SegmentationMedical Image Segmentation | CodeCode Available | 3 |
| Anything-3D: Towards Single-view Anything Reconstruction in the Wild | Apr 19, 2023 | 3D ReconstructionDiversity | CodeCode Available | 3 |
| SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More | Apr 18, 2023 | General KnowledgeImage Segmentation | CodeCode Available | 3 |
| Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks | Mar 30, 2023 | Human ParsingPedestrian Attribute Recognition | CodeCode Available | 3 |
| FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization | Mar 24, 2023 | 3D Hand Pose EstimationGPU | CodeCode Available | 3 |
| PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360^ | Mar 23, 2023 | Image GenerationImage Segmentation | CodeCode Available | 3 |
| A Simple Framework for Open-Vocabulary Segmentation and Detection | Mar 14, 2023 | Instance SegmentationPanoptic Segmentation | CodeCode Available | 3 |
| Universal Instance Perception as Object Discovery and Retrieval | Mar 12, 2023 | Described Object DetectionGeneralized Referring Expression Comprehension | CodeCode Available | 3 |
| Cut and Learn for Unsupervised Object Detection and Instance Segmentation | Jan 26, 2023 | Instance Segmentationobject-detection | CodeCode Available | 3 |
| MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer | Jan 19, 2023 | Image GenerationImage Segmentation | CodeCode Available | 3 |
| Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling | Jan 9, 2023 | 2D Object DetectionContrastive Learning | CodeCode Available | 3 |
| ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders | Jan 2, 2023 | Object DetectionRepresentation Learning | CodeCode Available | 3 |
| PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360deg | Jan 1, 2023 | Image GenerationImage Segmentation | CodeCode Available | 3 |