SOTAVerified

Semantic Segmentation

Papers

Showing 901-950 of 14763 papers

| Title | Status | Hype |
|---|---|---|
| DuSSS: Dual Semantic Similarity-Supervised Vision-Language Model for Semi-Supervised Medical Image Segmentation | Code | 1 |
| ZoRI: Towards Discriminative Zero-Shot Remote Sensing Instance Segmentation | Code | 1 |
| Exploring Semantic Consistency and Style Diversity for Domain Generalized Semantic Segmentation | Code | 1 |
| MaskCLIP++: A Mask-Based CLIP Fine-tuning Framework for Open-Vocabulary Image Segmentation | Code | 1 |
| MoRe: Class Patch Attention Needs Regularization for Weakly Supervised Semantic Segmentation | Code | 1 |
| DCSEG: Decoupled 3D Open-Set Segmentation using Gaussian Splatting | Code | 1 |
| RapidNet: Multi-Level Dilated Convolution Based Mobile Backbone | Code | 1 |
| Towards Open-Vocabulary Video Semantic Segmentation | Code | 1 |
| Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning | Code | 1 |
| EOV-Seg: Efficient Open-Vocabulary Panoptic Segmentation | Code | 1 |
| Knowledge Transfer and Domain Adaptation for Fine-Grained Remote Sensing Image Segmentation | Code | 1 |
| MCP-MedSAM: A Powerful Lightweight Medical Segment Anything Model Trained with a Single GPU in Just One Day | Code | 1 |
| RSUniVLM: A Unified Vision Language Model for Remote Sensing via Granularity-oriented Mixture of Experts | Code | 1 |
| MRGen: Diffusion-based Controllable Data Engine for MRI Segmentation towards Unannotated Modalities | Code | 1 |
| Active Negative Loss: A Robust Framework for Learning with Noisy Labels | Code | 1 |
| COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training | Code | 1 |
| Multi-Granularity Video Object Segmentation | Code | 1 |
| Referring Video Object Segmentation via Language-aligned Track Selection | Code | 1 |
| SyncVIS: Synchronized Video Instance Segmentation | Code | 1 |
| Token Cropr: Faster ViTs for Quite a Few Tasks | Code | 1 |
| TAROT: Targeted Data Selection via Optimal Transport | Code | 1 |
| Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding | Code | 1 |
| Deformable Mamba for Wide Field of View Segmentation | Code | 1 |
| Learn from Foundation Model: Fruit Detection Model without Manual Annotation | Code | 1 |
| A SAM-guided and Match-based Semi-Supervised Segmentation Framework for Medical Imaging | Code | 1 |
| MulModSeg: Enhancing Unpaired Multi-Modal Medical Image Segmentation with Modality-Conditioned Text Embedding and Alternating Training | Code | 1 |
| Revisiting the Integration of Convolution and Attention for Vision Backbone | Code | 1 |
| CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-Vocabulary Semantic Segmentation | Code | 1 |
| XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation | Code | 1 |
| ITACLIP: Boosting Training-Free Semantic Segmentation with Image, Text, and Architectural Enhancements | Code | 1 |
| RETR: Multi-View Radar Detection Transformer for Indoor Perception | Code | 1 |
| OneNet: A Channel-Wise 1D Convolutional U-Net | Code | 1 |
| Fast and Efficient Transformer-based Method for Bird's Eye View Instance Prediction | Code | 1 |
| Arctique: An artificial histopathological dataset unifying realism and controllability for uncertainty quantification | Code | 1 |
| ZAHA: Introducing the Level of Facade Generalization and the Large-Scale Point Cloud Facade Semantic Segmentation Benchmark Dataset | Code | 1 |
| Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts | Code | 1 |
| LiVOS: Light Video Object Segmentation with Gated Linear Matching | Code | 1 |
| Rethinking Decoders for Transformer-based Semantic Segmentation: A Compression Perspective | Code | 1 |
| Automated Classification of Cell Shapes: A Comparative Evaluation of Shape Descriptors | Code | 1 |
| MLLA-UNet: Mamba-like Linear Attention in an Efficient U-Shape Model for Medical Image Segmentation | Code | 1 |
| Text-DiFuse: An Interactive Multi-Modal Image Fusion Framework based on Text-modulated Diffusion Model | Code | 1 |
| COSNet: A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes | Code | 1 |
| Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation | Code | 1 |
| Lightweight Frequency Masker for Cross-Domain Few-Shot Semantic Segmentation | Code | 1 |
| IndraEye: Infrared Electro-Optical UAV-based Perception Dataset for Robust Downstream Tasks | Code | 1 |
| Unlocking Comics: The AI4VA Dataset for Visual Understanding | Code | 1 |
| Fusion-then-Distillation: Toward Cross-modal Positive Distillation for Domain Adaptive 3D Semantic Segmentation | Code | 1 |
| Context-Based Visual-Language Place Recognition | Code | 1 |
| Gaze-Assisted Medical Image Segmentation | Code | 1 |
| ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting | Code | 1 |
Page 19 of 296

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | InternImage-H (M3I Pre-training) | Params (M) | 1,310 | | Unverified |
| 2 | ViT-P (InternImage-H) | Validation mIoU | 63.6 | | Unverified |
| 3 | ONE-PEACE | Validation mIoU | 63 | | Unverified |
| 4 | InternImage-H | Validation mIoU | 62.9 | | Unverified |
| 5 | M3I Pre-training (InternImage-H) | Validation mIoU | 62.9 | | Unverified |
| 6 | BEiT-3 | Validation mIoU | 62.8 | | Unverified |
| 7 | EVA | Validation mIoU | 62.3 | | Unverified |
| 8 | ViT-P (OneFormer, InternImage-H) | Validation mIoU | 61.6 | | Unverified |
| 9 | ViT-Adapter-L (Mask2Former, BEiTv2 pretrain) | Validation mIoU | 61.5 | | Unverified |
| 10 | FD-SwinV2-G | Validation mIoU | 61.4 | | Unverified |