SOTAVerified

Semantic Segmentation

Papers

Showing 551–600 of 14,763 papers

Title | Status | Hype
DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception | Code | 2
DiffRect: Latent Diffusion Label Rectification for Semi-supervised Medical Image Segmentation | Code | 2
Diffusion models as plug-and-play priors | Code | 2
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception | Code | 2
Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive | Code | 2
DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation | Code | 2
Distribution-Free, Risk-Controlling Prediction Sets | Code | 2
DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data | Code | 2
Decoupling Features in Hierarchical Propagation for Video Object Segmentation | Code | 2
DreamColour: Controllable Video Colour Editing without Training | Code | 2
DSNet: A Novel Way to Use Atrous Convolutions in Semantic Segmentation | Code | 2
DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation | Code | 2
DytanVO: Joint Refinement of Visual Odometry and Motion Segmentation in Dynamic Environments | Code | 2
E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation | Code | 2
Agent Attention: On the Integration of Softmax and Linear Attention | Code | 2
Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation | Code | 2
Understanding the Tricks of Deep Learning in Medical Image Segmentation: Challenges and Future Directions | Code | 2
EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications | Code | 2
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention | Code | 2
AgileFormer: Spatially Agile Transformer UNet for Medical Image Segmentation | Code | 2
MogaNet: Multi-order Gated Aggregation Network | Code | 2
Dataset Quantization | Code | 2
DaViT: Dual Attention Vision Transformers | Code | 2
DAMamba: Vision State Space Model with Dynamic Adaptive Scan | Code | 2
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models | Code | 2
EM-Net: Efficient Channel and Frequency Learning with Mamba for 3D Medical Image Segmentation | Code | 2
ESP-MedSAM: Efficient Self-Prompting SAM for Universal Domain-Generalized Medical Image Segmentation | Code | 2
Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation | Code | 2
A large annotated medical image dataset for the development and evaluation of segmentation algorithms | Code | 2
AiTLAS: Artificial Intelligence Toolbox for Earth Observation | Code | 2
A Data-scalable Transformer for Medical Image Segmentation: Architecture, Model Efficiency, and Benchmark | Code | 2
DDP: Diffusion Model for Dense Visual Prediction | Code | 2
Deep Incubation: Training Large Models by Divide-and-Conquering | Code | 2
Fast Vision Transformers with HiLo Attention | Code | 2
Cross Language Image Matching for Weakly Supervised Semantic Segmentation | Code | 2
Cross-Image Relational Knowledge Distillation for Semantic Segmentation | Code | 2
Feature Pyramid Networks for Object Detection | Code | 2
FEC: Fast Euclidean Clustering for Point Cloud Segmentation | Code | 2
Fields of The World: A Machine Learning Benchmark Dataset For Global Agricultural Field Boundary Segmentation | Code | 2
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | Code | 2
Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images | Code | 2
CrossEarth: Geospatial Vision Foundation Model for Domain Generalizable Remote Sensing Semantic Segmentation | Code | 2
FreeSOLO: Learning to Segment Objects without Annotations | Code | 2
Frequency-Adaptive Dilated Convolution for Semantic Segmentation | Code | 2
CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation | Code | 2
RevSAM2: Prompt SAM2 for Medical Image Segmentation via Reverse-Propagation without Fine-tuning | Code | 2
Fully Convolutional Instance-aware Semantic Segmentation | Code | 2
FusionVision: A comprehensive approach of 3D object reconstruction and segmentation from RGB-D cameras using YOLO and fast segment anything | Code | 2
CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale Attention | Code | 2
Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | InternImage-H (M3I Pre-training) | Params (M) | 1,310 | | Unverified
2 | ViT-P (InternImage-H) | Validation mIoU | 63.6 | | Unverified
3 | ONE-PEACE | Validation mIoU | 63 | | Unverified
4 | InternImage-H | Validation mIoU | 62.9 | | Unverified
5 | M3I Pre-training (InternImage-H) | Validation mIoU | 62.9 | | Unverified
6 | BEiT-3 | Validation mIoU | 62.8 | | Unverified
7 | EVA | Validation mIoU | 62.3 | | Unverified
8 | ViT-P (OneFormer, InternImage-H) | Validation mIoU | 61.6 | | Unverified
9 | ViT-Adapter-L (Mask2Former, BEiTv2 pretrain) | Validation mIoU | 61.5 | | Unverified
10 | FD-SwinV2-G | Validation mIoU | 61.4 | | Unverified