Semantic Segmentation

Papers

Showing 1–50 of 14763 papers

Title	Date	Tasks	Status	Hype
SAM 2: Segment Anything in Images and Videos	Aug 1, 2024	Image Segmentation, Robot Manipulation Generalization	Code Available	11
YOLO-World: Real-Time Open-Vocabulary Object Detection	Jan 30, 2024	Instance Segmentation, Language Modeling	Code Available	9
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data	Jan 19, 2024	Data Augmentation, Depth Estimation	Code Available	9
Efficient MedSAMs: Segment Anything in Medical Images on Laptop	Dec 20, 2024	Image Segmentation, Medical Image Segmentation	Code Available	7
Efficient Track Anything	Nov 28, 2024	Object, Segmentation	Code Available	7
MambaVision: A Hybrid Mamba-Transformer Vision Backbone	Jul 10, 2024	Image Classification, Instance Segmentation	Code Available	7
MambaOut: Do We Really Need Mamba for Vision?	May 13, 2024	image-classification, Image Classification	Code Available	7
Bilateral Reference for High-Resolution Dichotomous Image Segmentation	Jan 7, 2024	Camouflaged Object Segmentation, Dichotomous Image Segmentation	Code Available	7
U-Net v2: Rethinking the Skip Connections of U-Net for Medical Image Segmentation	Nov 29, 2023	Computational Efficiency, Decoder	Code Available	6
Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution	Jul 12, 2023	Fairness, Image Classification	Code Available	6
DINOv2: Learning Robust Visual Features without Supervision	Apr 14, 2023	Depth Estimation, Domain Generalization	Code Available	6
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification	May 22, 2025	2D Semantic Segmentation, Activity Prediction	Code Available	5
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation	Apr 7, 2025	Inference Optimization, Referring Video Object Segmentation	Code Available	5
4th PVUW MeViS 3rd Place Report: Sa2VA	Apr 1, 2025	Language Modeling, Language Modelling	Code Available	5
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos	Jan 7, 2025	2k, Language Modeling	Code Available	5
Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey	Aug 23, 2024	Image Segmentation, Segmentation	Code Available	5
SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More	Aug 8, 2024	Image Segmentation, Medical Image Segmentation	Code Available	5
Segment Anything for Videos: A Systematic Survey	Jul 31, 2024	Image Segmentation, Robot Manipulation Generalization	Code Available	5
PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery	Jun 16, 2024	Decoder, Earth Observation	Code Available	5
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities	Jun 13, 2024	Instance Segmentation, multimodal generation	Code Available	5
Matching Anything by Segmenting Anything	Jun 6, 2024	Domain Generalization, Multiple Object Tracking	Code Available	5
FeatUp: A Model-Agnostic Framework for Features at Any Resolution	Mar 15, 2024	Depth Estimation, Depth Prediction	Code Available	5
OMG-Seg: Is One Model Good Enough For All Segmentation?	Jan 18, 2024	All, Decoder	Code Available	5
Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions	Jan 7, 2024	Benchmarking, Image Segmentation	Code Available	5
YOLOR-Based Multi-Task Learning	Sep 29, 2023	Image Captioning, Instance Segmentation	Code Available	5
Faster Segment Anything: Towards Lightweight SAM for Mobile Applications	Jun 25, 2023	CPU, Decoder	Code Available	5
Infinite Photorealistic Worlds using Procedural Generation	Jun 15, 2023	3D Reconstruction, object-detection	Code Available	5
Track Anything: Segment Anything Meets Videos	Apr 24, 2023	Image Segmentation, Object Tracking	Code Available	5
Segment Anything	Apr 5, 2023	Event-based Object Segmentation, Image Segmentation	Code Available	5
A ConvNet for the 2020s	Jan 10, 2022	Classification, Domain Generalization	Code Available	5
Attention on the Sphere	May 16, 2025	Depth Estimation, Image Segmentation	Code Available	4
Your ViT is Secretly an Image Segmentation Model	Mar 24, 2025	Decoder, Image Segmentation	Code Available	4
Sonata: Self-Supervised Learning of Reliable Point Representations	Mar 20, 2025	3D Semantic Segmentation, Self-Supervised Learning	Code Available	4
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels	Feb 27, 2025	Image Classification, Instance Segmentation	Code Available	4
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree	Oct 21, 2024	Heuristic Search, Object	Code Available	4
EmbodiedSAM: Online Segment Any 3D Thing in Real Time	Aug 21, 2024	3D Instance Segmentation, GPU	Code Available	4
Medical SAM 2: Segment medical images as video via Segment Anything Model 2	Aug 1, 2024	Image Segmentation, Interactive Segmentation	Code Available	4
PVUW 2024 Challenge on Complex Video Understanding: Methods and Results	Jun 24, 2024	Segmentation, Semantic Segmentation	Code Available	4
LSKNet: A Foundation Lightweight Backbone for Remote Sensing	Mar 18, 2024	Change Detection, object-detection	Code Available	4
Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation	Feb 16, 2024	Cardiac Segmentation, Decoder	Code Available	4
Semi-Mamba-UNet: Pixel-Level Contrastive and Pixel-Level Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation	Feb 11, 2024	Cardiac Segmentation, Contrastive Learning	Code Available	4
Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation	Feb 7, 2024	Cardiac Segmentation, Computational Efficiency	Code Available	4
InstanceDiffusion: Instance-level Control for Image Generation	Feb 5, 2024	Conditional Text-to-Image Synthesis, Image Generation	Code Available	4
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation	Jan 24, 2024	Image Segmentation, Mamba	Code Available	4
Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering	Jan 12, 2024	3D Panoptic Segmentation, 3D Semantic Segmentation	Code Available	4
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications	Jan 11, 2024	image-classification, Image Classification	Code Available	4
LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model	Dec 28, 2023	Instance Segmentation, Language Modeling	Code Available	4
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything	Dec 1, 2023	Decoder, image-classification	Code Available	4
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing	Nov 1, 2023	All, Image Generation	Code Available	4
3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers	Oct 11, 2023	Decoder, Image Segmentation	Code Available	4

All datasets, ADE20K, NYU-Depth V2, Cityscapes test, Cityscapes val, ADE20K val, PASCAL Context, S3DIS Area5, PASCAL VOC 2012 test, S3DIS, ScanNet, SUN-RGBD, DensePASS

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	InternImage-H (M3I Pre-training)	Params (M)	1,310	—	Unverified
2	ViT-P (InternImage-H)	Validation mIoU	63.6	—	Unverified
3	ONE-PEACE	Validation mIoU	63	—	Unverified
4	M3I Pre-training (InternImage-H)	Validation mIoU	62.9	—	Unverified
5	InternImage-H	Validation mIoU	62.9	—	Unverified
6	BEiT-3	Validation mIoU	62.8	—	Unverified
7	EVA	Validation mIoU	62.3	—	Unverified
8	ViT-P (OneFormer, InternImage-H)	Validation mIoU	61.6	—	Unverified
9	ViT-Adapter-L (Mask2Former, BEiTv2 pretrain)	Validation mIoU	61.5	—	Unverified
10	FD-SwinV2-G	Validation mIoU	61.4	—	Unverified