SOTAVerified

Semantic Segmentation

Papers

Showing 2476–2500 of 14763 papers

Title | Status | Hype
DenoiseRep: Denoising Model for Representation Learning | Code | 1
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities | Code | 5
Instance-level quantitative saliency in multiple sclerosis lesion segmentation | Code | 0
A Labeled Array Distance Metric for Measuring Image Segmentation Quality | | 0
Generalizable Disaster Damage Assessment via Change Detection with Vision Foundation Model | | 0
Dataset Enhancement with Instance-Level Augmentations | Code | 1
SimSAM: Simple Siamese Representations Based Semantic Affinity Matrix for Unsupervised Image Segmentation | Code | 0
APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation | | 0
A^2-MAE: A spatial-temporal-spectral unified remote sensing pre-training method based on anchor-aware masked autoencoder | | 0
GRU-Net: Gaussian Attention Aided Dense Skip Connection Based MultiResUNet for Breast Histopathology Image Segmentation | Code | 0
2nd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation | | 0
Real2Code: Reconstruct Articulated Objects via Code Generation | | 0
RMem: Restricted Memory Banks Improve Video Object Segmentation | | 0
Small Scale Data-Free Knowledge Distillation | Code | 1
OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding | Code | 1
Spatial-Frequency Dual Progressive Attention Network For Medical Image Segmentation | Code | 1
A Semantic-Aware and Multi-Guided Network for Infrared-Visible Image Fusion | Code | 0
Watching Swarm Dynamics from Above: A Framework for Advanced Object Tracking in Drone Videos | | 0
LiSD: An Efficient Multi-Task Learning Framework for LiDAR Segmentation and Detection | | 0
Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph | | 0
PanoSSC: Exploring Monocular Panoptic 3D Scene Reconstruction for Autonomous Driving | | 0
1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation | Code | 1
Minimizing Energy Costs in Deep Learning Model Training: The Gaussian Sampling Approach | | 0
Dual Thinking and Logical Processing -- Are Multi-modal Large Language Models Closing the Gap with Human Vision ? | Code | 0
UVIS: Unsupervised Video Instance Segmentation | | 0
Page 100 of 591

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | InternImage-H (M3I Pre-training) | Params (M) | 1,310 | | Unverified
2 | ViT-P (InternImage-H) | Validation mIoU | 63.6 | | Unverified
3 | ONE-PEACE | Validation mIoU | 63 | | Unverified
4 | InternImage-H | Validation mIoU | 62.9 | | Unverified
5 | M3I Pre-training (InternImage-H) | Validation mIoU | 62.9 | | Unverified
6 | BEiT-3 | Validation mIoU | 62.8 | | Unverified
7 | EVA | Validation mIoU | 62.3 | | Unverified
8 | ViT-P (OneFormer, InternImage-H) | Validation mIoU | 61.6 | | Unverified
9 | ViT-Adapter-L (Mask2Former, BEiTv2 pretrain) | Validation mIoU | 61.5 | | Unverified
10 | FD-SwinV2-G | Validation mIoU | 61.4 | | Unverified