SOTAVerified

Semantic Segmentation

Papers

Showing 101–150 of 14,763 papers

Title | Status | Hype
CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation | Code | 3
EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation | Code | 3
Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving | Code | 3
FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes | Code | 3
Moving Object Segmentation: All You Need Is SAM (and Flow) | Code | 3
How to build the best medical image segmentation algorithm using foundation models: a comprehensive empirical study with Segment Anything Model | Code | 3
SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation | Code | 3
Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation | Code | 3
RS-Mamba for Large Remote Sensing Image Dense Prediction | Code | 3
UltraLight VM-UNet: Parallel Vision Mamba Significantly Reduces Parameters for Skin Lesion Segmentation | Code | 3
Segment Any Medical Model Extended | Code | 3
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Code | 3
Segment Anything Model for Road Network Graph Extraction | Code | 3
PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | Code | 3
MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Code | 3
ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions | Code | 3
What Matters When Repurposing Diffusion Models for General Dense Perception Tasks? | Code | 3
LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation | Code | 3
Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining | Code | 3
SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM | Code | 3
RAP-SAM: Towards Real-Time All-Purpose Segment Anything | Code | 3
Denoising Vision Transformers | Code | 3
Stronger Fewer & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation | Code | 3
Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation | Code | 3
LangSplat: 3D Language Gaussian Splatting | Code | 3
Point Transformer V3: Simpler, Faster, Stronger | Code | 3
AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One | Code | 3
Generalized Robot 3D Vision-Language Model with Fast Rendering and Pre-Training Vision-Language Alignment | Code | 3
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition | Code | 3
SA-Med2D-20M Dataset: Segment Anything in 2D Medical Imaging with 20 Million masks | Code | 3
Putting the Object Back into Video Object Segmentation | Code | 3
Tracking Anything with Decoupled Video Segmentation | Code | 3
SAM-Med2D | Code | 3
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation | Code | 3
Quantifying the robustness of deep multispectral segmentation models against natural perturbations and data poisoning | Code | 3
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | Code | 3
Personalize Segment Anything Model with One Shot | Code | 3
Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation | Code | 3
Anything-3D: Towards Single-view Anything Reconstruction in the Wild | Code | 3
SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More | Code | 3
Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks | Code | 3
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization | Code | 3
PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360° | Code | 3
A Simple Framework for Open-Vocabulary Segmentation and Detection | Code | 3
Universal Instance Perception as Object Discovery and Retrieval | Code | 3
Cut and Learn for Unsupervised Object Detection and Instance Segmentation | Code | 3
MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer | Code | 3
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling | Code | 3
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders | Code | 3
PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360° | Code | 3

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | InternImage-H (M3I Pre-training) | Params (M) | 1,310 | | Unverified
2 | ViT-P (InternImage-H) | Validation mIoU | 63.6 | | Unverified
3 | ONE-PEACE | Validation mIoU | 63 | | Unverified
4 | InternImage-H | Validation mIoU | 62.9 | | Unverified
5 | M3I Pre-training (InternImage-H) | Validation mIoU | 62.9 | | Unverified
6 | BEiT-3 | Validation mIoU | 62.8 | | Unverified
7 | EVA | Validation mIoU | 62.3 | | Unverified
8 | ViT-P (OneFormer, InternImage-H) | Validation mIoU | 61.6 | | Unverified
9 | ViT-Adapter-L (Mask2Former, BEiTv2 pretrain) | Validation mIoU | 61.5 | | Unverified
10 | FD-SwinV2-G | Validation mIoU | 61.4 | | Unverified