SOTAVerified

Open Vocabulary Semantic Segmentation

Papers

Showing 51-100 of 113 papers

Title | Status | Hype
Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models | Code | 1
ReME: A Data-Centric Framework for Training-Free Open-Vocabulary Segmentation | Code | 1
SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation | Code | 1
SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation | Code | 1
Auto-Vocabulary Semantic Segmentation | Code | 1
Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation | Code | 1
Show or Tell? A Benchmark To Evaluate Visual and Textual Prompts in Semantic Segmentation | Code | 1
TagAlign: Improving Vision-Language Alignment with Multi-Tag Classification | Code | 1
TagCLIP: Improving Discrimination Ability of Open-Vocabulary Semantic Segmentation | Code | 1
TAG: Guidance-free Open-Vocabulary Semantic Segmentation | Code | 1
Test-Time Adaptation of Vision-Language Models for Open-Vocabulary Semantic Segmentation | Code | 1
TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias | Code | 1
UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding | Code | 1
Uncovering Prototypical Knowledge for Weakly Open-Vocabulary Semantic Segmentation | Code | 1
XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation | Code | 1
A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model | Code | 1
OVOSE: Open-Vocabulary Semantic Segmentation in Event-Based Cameras | Code | 0
AttrSeg: Open-Vocabulary Semantic Segmentation via Attribute Decomposition-Aggregation | Code | 0
Exploring Open-Vocabulary Semantic Segmentation from CLIP Vision Encoder Distillation Only | Code | 0
Test-Time Optimization for Domain Adaptive Open Vocabulary Segmentation | Code | 0
DINOv2 Meets Text: A Unified Framework for Image- and Pixel-Level Vision-Language Alignment | Code | 0
A Language-Guided Benchmark for Weakly Supervised Open Vocabulary Semantic Segmentation | Code | 0
MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation | Code | 0
OVGaussian: Generalizable 3D Gaussian Segmentation with Open Vocabularies | Code | 0
Test-time Contrastive Concepts for Open-world Semantic Segmentation | | 0
SILC: Improving Vision Language Pretraining with Self-Distillation | | 0
Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation | | 0
CUS3D: CLIP-based Unsupervised 3D Segmentation via Object-level Denoise | | 0
From Open-Vocabulary to Vocabulary-Free Semantic Segmentation | | 0
Generalizable Semantic Vision Query Generation for Zero-shot Panoptic and Semantic Segmentation | | 0
Rethinking the Global Knowledge of CLIP in Training-Free Open-Vocabulary Semantic Segmentation | | 0
Personalized OVSS: Understanding Personal Concept in Open-Vocabulary Semantic Segmentation | | 0
EAGLE: Efficient Adaptive Geometry-based Learning in Cross-view Understanding | | 0
Image-to-Image Matching via Foundation Models: A New Perspective for Open-Vocabulary Semantic Segmentation | | 0
Incorporating Feature Pyramid Tokenization and Open Vocabulary Semantic Segmentation | | 0
Dual Semantic Guidance for Open Vocabulary Semantic Segmentation | | 0
InvSeg: Test-Time Prompt Inversion for Semantic Segmentation | | 0
ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference | | 0
Class Enhancement Losses with Pseudo Labels for Zero-shot Semantic Segmentation | | 0
RadZero: Similarity-Based Cross-Attention for Explainable Vision-Language Alignment in Radiology with Zero-Shot Multi-Task Capability | | 0
Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation | | 0
Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors | | 0
LMSeg: Unleashing the Power of Large-Scale Models for Open-Vocabulary Semantic Segmentation | | 0
Understanding Fine-tuning CLIP for Open-vocabulary Semantic Segmentation in Hyperbolic Space | | 0
DPSeg: Dual-Prompt Cost Volume Learning for Open-Vocabulary Semantic Segmentation | | 0
3D Vision-Language Gaussian Splatting | | 0
MROVSeg: Breaking the Resolution Curse of Vision-Language Models in Open-Vocabulary Image Segmentation | | 0
Multi-Grained Cross-modal Alignment for Learning Open-vocabulary Semantic Segmentation from Text Supervision | | 0
Multi-robot autonomous 3D reconstruction using Gaussian splatting with Semantic guidance | | 0
MVP-SEG: Multi-View Prompt Learning for Open-Vocabulary Semantic Segmentation | | 0

Benchmark Results

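# | Model | Metric | Claimed | Verified | Status
1 | HyperSeg | mIoU | 64.6 | | Unverified
2 | SILC | mIoU | 63.5 | | Unverified
3 | CAT-Seg | mIoU | 63.3 | | Unverified
4 | MaskCLIP++ | mIoU | 62.5 | | Unverified
5 | CLIPSelf | mIoU | 62.3 | | Unverified
6 | UMG-CLIP-L/14 | mIoU | 61 | | Unverified
7 | SED | mIoU | 60.6 | | Unverified
8 | Mask-Adapter | mIoU | 60.4 | | Unverified
9 | EBSeg-L | mIoU | 60.2 | | Unverified
10 | MAFT+ | mIoU | 59.4 | | Unverified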

# | Model | Metric | Claimed | Verified | Status
1 | UMG-CLIP-E/14 | mIoU | 38.2 | | Unverified
2 | MaskCLIP++ | mIoU | 38.2 | | Unverified
3 | Mask-Adapter | mIoU | 38.2 | | Unverified
4 | CAT-Seg | mIoU | 37.9 | | Unverified
5 | SILC | mIoU | 37.7 | | Unverified
6 | UMG-CLIP-L/14 | mIoU | 36.1 | | Unverified
7 | MAFT+ | mIoU | 36.1 | | Unverified
8 | OVSeg + OpenDAS | mIoU | 35.8 | | Unverified
9 | SED | mIoU | 35.2 | | Unverified
10 | CLIPSelf | mIoU | 34.5 | | Unverified
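
# | Model | Metric | Claimed | Verified | Status
1 | UMG-CLIP-E/14 | mIoU | 17.3 | | Unverified
2 | MaskCLIP++ | mIoU | 16.8 | | Unverified
3 | Mask-Adapter | mIoU | 16.2 | | Unverified
4 | CAT-Seg | mIoU | 16 | | Unverified
5 | UMG-CLIP-L/14 | mIoU | 15.4 | | Unverified
6 | MAFT+ | mIoU | 15.1 | | Unverified
7 | SILC | mIoU | 15 | | Unverified
8 | PosSAM | mIoU | 14.9 | | Unverified
9 | FC-CLIP | mIoU | 14.8 | | Unverified
10 | SCAN | mIoU | 14 | | Unverified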
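
# | Model | Metric | Claimed | Verified | Status
1 | UMG-CLIP-L/14 | mIoU | 97.9 | | Unverified
2 | SILC | mIoU | 97.6 | | Unverified
3 | SCAN | mIoU | 97.2 | | Unverified
4 | CAT-Seg | mIoU | 97 | | Unverified
5 | MaskCLIP++ | mIoU | 96.8 | | Unverified
6 | MAFT+ | mIoU | 96.5 | | Unverified
7 | EBSeg-L | mIoU | 96.4 | | Unverified
8 | FC-CLIP | mIoU | 95.4 | | Unverified
9 | OVSeg Swin-B | mIoU | 94.5 | | Unverified
10 | HyperSeg | mIoU | 92.1 | | Unverified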

# | Model | Metric | Claimed | Verified | Status
1 | SILC | mIoU | 25.8 | | Unverified
2 | UMG-CLIP-E/14 | mIoU | 25.2 | | Unverified
3 | MaskCLIP++ | mIoU | 23.9 | | Unverified
4 | CAT-Seg | mIoU | 23.8 | | Unverified
5 | UMG-CLIP-L/14 | mIoU | 23.2 | | Unverified
6 | Mask-Adapter | mIoU | 22.7 | | Unverified
7 | SED | mIoU | 22.6 | | Unverified
8 | MAFT+ | mIoU | 21.6 | | Unverified
9 | EBSeg-L | mIoU | 21 | | Unverified
10 | FC-CLIP | mIoU | 18.2 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | POMP | HIoU | 39.1 | | Unverified
2 | ZSSeg | HIoU | 37.8 | | Unverified
3 | ZegFormer | HIoU | 34.8 | | Unverified
4 | TTD (TCL) | mIoU | 23.7 | | Unverified
5 | LaVG | mIoU | 23.2 | | Unverified
6 | CLIP Surgery (original CLIP without any fine-tuning) | mIoU | 21.9 | | Unverified
7 | TTD (MaskCLIP) | mIoU | 19.4 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | FC-CLIP | mIoU | 56.2 | | Unverified
2 | SimSeg | mIoU | 34.5 | | Unverified
3 | TTD (TCL) | mIoU | 32 | | Unverified
4 | CLIP Surgery (CLIP without any fine-tuning) | mIoU | 31.4 | | Unverified
5 | TTD (MaskCLIP) | mIoU | 27 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | UMG-CLIP-E/14 | mIoU | 85.4 | | Unverified
2 | CAT-Seg | mIoU | 82.5 | | Unverified
3 | SILC | mIoU | 82.5 | | Unverified
4 | FC-CLIP | mIoU | 81.8 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SkySense-O | mIoU | -43.9 | | Unverified
2 | SegEarth-OV | mIoU | -21.7 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PACL | mIoU | 38.8 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SkySense-O | mIoU | 8.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SkySense-O | mIoU | 54.1 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SkySense-O | mIoU | 30.89 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SkySense-O | mIoU | 32.12 | | Unverified