SOTAVerified

Semantic Segmentation

Papers

Showing 150 of 14763 papers

Title | Status | Hype
SAM 2: Segment Anything in Images and Videos | Code | 11
YOLO-World: Real-Time Open-Vocabulary Object Detection | Code | 9
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data | Code | 9
MambaOut: Do We Really Need Mamba for Vision? | Code | 7
Efficient Track Anything | Code | 7
Bilateral Reference for High-Resolution Dichotomous Image Segmentation | Code | 7
MambaVision: A Hybrid Mamba-Transformer Vision Backbone | Code | 7
Efficient MedSAMs: Segment Anything in Medical Images on Laptop | Code | 7
Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution | Code | 6
DINOv2: Learning Robust Visual Features without Supervision | Code | 6
U-Net v2: Rethinking the Skip Connections of U-Net for Medical Image Segmentation | Code | 6
PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery | Code | 5
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities | Code | 5
4th PVUW MeViS 3rd Place Report: Sa2VA | Code | 5
FeatUp: A Model-Agnostic Framework for Features at Any Resolution | Code | 5
Infinite Photorealistic Worlds using Procedural Generation | Code | 5
Matching Anything by Segmenting Anything | Code | 5
OMG-Seg: Is One Model Good Enough For All Segmentation? | Code | 5
A ConvNet for the 2020s | Code | 5
Faster Segment Anything: Towards Lightweight SAM for Mobile Applications | Code | 5
Track Anything: Segment Anything Meets Videos | Code | 5
Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey | Code | 5
Segment Anything for Videos: A Systematic Survey | Code | 5
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification | Code | 5
Segment Anything | Code | 5
Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions | Code | 5
YOLOR-Based Multi-Task Learning | Code | 5
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation | Code | 5
SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More | Code | 5
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Code | 5
RTMDet: An Empirical Study of Designing Real-Time Object Detectors | Code | 4
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN | Code | 4
GLIPv2: Unifying Localization and Vision-Language Understanding | Code | 4
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree | Code | 4
LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model | Code | 4
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels | Code | 4
Highly Accurate Dichotomous Image Segmentation | Code | 4
Panoptic Feature Pyramid Networks | Code | 4
PVUW 2024 Challenge on Complex Video Understanding: Methods and Results | Code | 4
Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering | Code | 4
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications | Code | 4
Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation | Code | 4
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything | Code | 4
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | Code | 4
Attention on the Sphere | Code | 4
LSKNet: A Foundation Lightweight Backbone for Remote Sensing | Code | 4
Deep Residual Learning for Image Recognition | Code | 4
EmbodiedSAM: Online Segment Any 3D Thing in Real Time | Code | 4
Detectron2 Object Detection & Manipulating Images using Cartoonization | Code | 4
InstanceDiffusion: Instance-level Control for Image Generation | Code | 4

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | InternImage-H (M3I Pre-training) | Params (M) | 1,310 | | Unverified
2 | ViT-P (InternImage-H) | Validation mIoU | 63.6 | | Unverified
3 | ONE-PEACE | Validation mIoU | 63 | | Unverified
4 | M3I Pre-training (InternImage-H) | Validation mIoU | 62.9 | | Unverified
5 | InternImage-H | Validation mIoU | 62.9 | | Unverified
6 | BEiT-3 | Validation mIoU | 62.8 | | Unverified
7 | EVA | Validation mIoU | 62.3 | | Unverified
8 | ViT-P (OneFormer, InternImage-H) | Validation mIoU | 61.6 | | Unverified
9 | ViT-Adapter-L (Mask2Former, BEiTv2 pretrain) | Validation mIoU | 61.5 | | Unverified
10 | FD-SwinV2-G | Validation mIoU | 61.4 | | Unverified