Video Semantic Segmentation

The goal of video semantic segmentation is to assign a predefined class to each pixel in all frames of a video. This requires the model not only to predict accurate segmentation masks but also to ensure that these masks remain temporally consistent across frames. This task has broad applications in areas such as autonomous driving, medical video analysis, and AR/VR.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1–25 of 895 papers

Title	Date	Tasks	Status	Hype	Score
SAM 2: Segment Anything in Images and Videos	Aug 1, 2024	Image SegmentationRobot Manipulation Generalization	CodeCode Available	12	5
Segment Anything in Medical Images and Videos: Benchmark and Deployment	Aug 6, 2024	BenchmarkingSegmentation	CodeCode Available	7	5
Efficient Track Anything	Nov 28, 2024	ObjectSegmentation	CodeCode Available	7	5
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos	Jan 7, 2025	2kLanguage Modeling	CodeCode Available	5	5
Underwater Camouflaged Object Tracking Meets Vision-Language SAM2	Sep 25, 2024	ObjectObject Tracking	CodeCode Available	5	5
4th PVUW MeViS 3rd Place Report: Sa2VA	Apr 1, 2025	Language ModelingLanguage Modelling	CodeCode Available	5	5
Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey	Aug 23, 2024	Image SegmentationSegmentation	CodeCode Available	5	5
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation	Apr 7, 2025	Inference OptimizationReferring Video Object Segmentation	CodeCode Available	5	5
OMG-Seg: Is One Model Good Enough For All Segmentation?	Jan 18, 2024	AllDecoder	CodeCode Available	5	5
SiamMask: A Framework for Fast Online Object Tracking and Segmentation	Jul 5, 2022	Multiple Object TrackingObject	CodeCode Available	4	5
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree	Oct 21, 2024	Heuristic SearchObject	CodeCode Available	4	5
PVUW 2024 Challenge on Complex Video Understanding: Methods and Results	Jun 24, 2024	SegmentationSemantic Segmentation	CodeCode Available	4	5
SegGPT: Segmenting Everything In Context	Apr 6, 2023	Few-Shot Semantic SegmentationIn-Context Learning	CodeCode Available	4	5
MedSAM2: Segment Anything in 3D Medical Images and Videos	Apr 4, 2025	SegmentationVideo Segmentation	CodeCode Available	4	5
EdgeTAM: On-Device Track Anything Model	Jan 13, 2025	modelVideo Segmentation	CodeCode Available	4	5
SMITE: Segment Me In TimE	Oct 24, 2024	SegmentationSemantic Segmentation	CodeCode Available	3	5
Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dependent Concepts under Different Scenes	Dec 2, 2024	In-Context LearningVideo Segmentation	CodeCode Available	3	5
Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2	Aug 3, 2024	DiversitySegmentation	CodeCode Available	3	5
RAP-SAM: Towards Real-Time All-Purpose Segment Anything	Jan 18, 2024	AllDecoder	CodeCode Available	3	5
Putting the Object Back into Video Object Segmentation	Oct 19, 2023	ObjectSegmentation	CodeCode Available	3	5
Moving Object Segmentation: All You Need Is SAM (and Flow)	Apr 18, 2024	AllMotion Segmentation	CodeCode Available	3	5
Personalize Segment Anything Model with One Shot	May 4, 2023	Image Generationmodel	CodeCode Available	3	5
Min-Max Similarity: A Contrastive Semi-Supervised Deep Learning Network for Surgical Tools Segmentation	Mar 29, 2022	Contrastive LearningSegmentation	CodeCode Available	3	5
PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model	Mar 21, 2024	DecoderGeneralized Referring Expression Segmentation	CodeCode Available	3	5
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation	Nov 26, 2024	Natural Language UnderstandingReferring Video Object Segmentation	CodeCode Available	3	5

Show:10 25 50

← PrevPage 1 of 36Next →

All datasets Cityscapes val CamVid VSPW LaRS Multispectral Video Semantic Segmentation

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
3	TDNet-50 [9]	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DVIS++(VIT-L)	mIoU	63.8	—	Unverified
2	UniVS(Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link(Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA(MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM(MiT-B5)	mIoU	49.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MVNet(DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet(PSPNet)	mIoU	54.36	—	Unverified
3	MVNet(FCN)	mIoU	53.9	—	Unverified