Video Semantic Segmentation

The goal of video semantic segmentation is to assign a predefined class to each pixel in all frames of a video. This requires the model not only to predict accurate segmentation masks but also to ensure that these masks remain temporally consistent across frames. This task has broad applications in areas such as autonomous driving, medical video analysis, and AR/VR.
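The two requirements above, per-pixel accuracy and temporal consistency, can be made concrete with a small dependency-free sketch. It is illustrative only: `mean_iou` pools pixels over all frames and skips classes absent from both prediction and ground truth, while `naive_temporal_consistency` simply counts pixels whose label is unchanged between consecutive frames. The function names and the toy data are assumptions, not taken from any benchmark's evaluation code. Published benchmarks follow dataset-specific protocols, and temporal consistency is usually measured after motion compensation rather than by raw frame-to-frame agreement.

```python
from itertools import chain


def flatten(video):
    """Flatten a video (frames -> rows -> pixel labels) into one list of labels."""
    return list(chain.from_iterable(chain.from_iterable(video)))


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union, pooled over every pixel of every frame."""
    p, t = flatten(pred), flatten(target)
    ious = []
    for c in range(num_classes):
        inter = sum(1 for a, b in zip(p, t) if a == c and b == c)
        union = sum(1 for a, b in zip(p, t) if a == c or b == c)
        if union:  # skip classes that appear in neither prediction nor target
            ious.append(inter / union)
    return sum(ious) / len(ious)


def naive_temporal_consistency(pred):
    """Fraction of pixels whose label is unchanged between consecutive frames.

    Assumption: no motion compensation, so this is only meaningful for
    static scenes or as a rough flicker indicator.
    """
    same = total = 0
    for prev, cur in zip(pred, pred[1:]):
        for a, b in zip(chain.from_iterable(prev), chain.from_iterable(cur)):
            same += a == b
            total += 1
    return same / total


# Toy 3-frame, 2x2 video with two classes (hypothetical data).
target = [[[0, 1], [0, 1]] for _ in range(3)]
pred = [
    [[0, 1], [0, 1]],
    [[1, 1], [0, 1]],  # one pixel flickers to class 1 in the middle frame
    [[0, 1], [0, 1]],
]

print(mean_iou(pred, target, num_classes=2))   # 71/84, about 0.845
print(naive_temporal_consistency(pred))        # 0.75
```

In this toy case a single flickering pixel lowers mIoU only slightly but costs a quarter of the consistency score, which is why the two properties are treated as separate requirements.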

Papers


Showing 301–325 of 895 papers

Title	Date	Tasks	Status
PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild	Apr 15, 2025	Segmentation, Semantic Segmentation	Unverified
MASSeg : 2nd Technical Report for 4th PVUW MOSE Track	Apr 14, 2025	Data Augmentation, Object	Code Available
FVOS for MOSE Track of 4th PVUW Challenge: 3rd Place Solution	Apr 13, 2025	Segmentation, Semantic Segmentation	Unverified
STSeg-Complex Video Object Segmentation: The 1st Solution for 4th PVUW MOSE Challenge	Apr 11, 2025	Semantic Segmentation, Video Object Segmentation	Unverified
Multi-person Physics-based Pose Estimation for Combat Sports	Apr 11, 2025	3D Human Pose Estimation, 3D Multi-Person Pose Estimation	Unverified
Saliency-Motion Guided Trunk-Collateral Network for Unsupervised Video Object Segmentation	Apr 8, 2025	Optical Flow Estimation, Salient Object Detection	Unverified
CamoSAM2: Motion-Appearance Induced Auto-Refining Prompts for Video Camouflaged Object Detection	Apr 1, 2025	Camouflaged Object Segmentation, object-detection	Unverified
Zero-Shot 4D Lidar Panoptic Segmentation	Apr 1, 2025	Diversity, Panoptic Segmentation	Unverified
ReferDINO-Plus: 2nd Solution for 4th PVUW MeViS Challenge at CVPR 2025	Mar 30, 2025	Object, Referring Video Object Segmentation	Code Available
Comparative Analysis of Image, Video, and Audio Classifiers for Automated News Video Segmentation	Mar 27, 2025	Binary Classification, Video Segmentation	Unverified
Online Reasoning Video Segmentation with Just-in-Time Digital Twins	Mar 27, 2025	Reasoning Segmentation, Video Segmentation	Unverified
One-Shot Medical Video Object Segmentation via Temporal Contrastive Memory Networks	Mar 19, 2025	Decoder, Segmentation	Code Available
Reducing Annotation Burden: Exploiting Image Knowledge for Few-Shot Medical Video Object Segmentation via Spatiotemporal Consistency Relearning	Mar 19, 2025	Segmentation, Semantic Segmentation	Code Available
Leveraging Vision-Language Models for Open-Vocabulary Instance Segmentation and Tracking	Mar 18, 2025	Descriptive, Instance Segmentation	Code Available
AUTV: Creating Underwater Video Datasets with Pixel-wise Annotations	Mar 17, 2025	Semantic Segmentation, Video Generation	Unverified
SAM2 for Image and Video Segmentation: A Comprehensive Survey	Mar 17, 2025	Autonomous Driving, Image Segmentation	Unverified
Leveraging Motion Information for Better Self-Supervised Video Correspondence Learning	Mar 15, 2025	Object, Semantic Segmentation	Unverified
Investigation of Frame Differences as Motion Cues for Video Object Segmentation	Mar 12, 2025	Optical Flow Estimation, Segmentation	Unverified
Open-World Skill Discovery from Unsegmented Demonstrations	Mar 11, 2025	Boundary Detection, Event Segmentation	Unverified
OmniSAM: Omnidirectional Segment Anything Model for UDA in Panoramic Semantic Segmentation	Mar 10, 2025	Pseudo Label, Semantic Segmentation	Unverified
Rethinking Few-Shot Medical Image Segmentation by SAM2: A Training-Free Framework with Augmentative Prompting and Dynamic Matching	Mar 5, 2025	Data Augmentation, Few-Shot Learning	Unverified
Parameter-free Video Segmentation for Vision and Language Understanding	Mar 3, 2025	Question Answering, Video Question Answering	Unverified
An Analysis of Data Transformation Effects on Segment Anything 2	Feb 25, 2025	Semantic Segmentation, Video Object Segmentation	Unverified
Deep learning approaches to surgical video segmentation and object detection: A Scoping Review	Feb 23, 2025	object-detection, Object Detection	Unverified
Pointmap Association and Piecewise-Plane Constraint for Consistent and Compact 3D Gaussian Segmentation Field	Feb 22, 2025	2D Panoptic Segmentation, 3D Scene Reconstruction	Unverified



Datasets: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

Cityscapes val
#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	TDNet-50 [9]	mIoU	79.9	—	Unverified
3	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

CamVid
#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

VSPW
#	Model	Metric	Claimed	Verified	Status
1	DVIS++(VIT-L)	mIoU	63.8	—	Unverified
2	UniVS(Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link(Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA(MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM(MiT-B5)	mIoU	49.3	—	Unverified

LaRS
#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

Multispectral Video Semantic Segmentation
#	Model	Metric	Claimed	Verified	Status
1	MVNet(DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet(PSPNet)	mIoU	54.36	—	Unverified
3	MVNet(FCN)	mIoU	53.9	—	Unverified