Video Semantic Segmentation

The goal of video semantic segmentation is to assign a predefined class to each pixel in all frames of a video. This requires the model not only to predict accurate segmentation masks but also to ensure that these masks remain temporally consistent across frames. This task has broad applications in areas such as autonomous driving, medical video analysis, and AR/VR.
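The two requirements above, per-pixel accuracy and temporal consistency, can be illustrated with a minimal sketch. It assumes predictions and ground truth are integer label arrays of shape (frames, height, width) and uses NumPy. The function names `mean_iou` and `naive_label_stability` are hypothetical. The stability score is a deliberate simplification: it compares consecutive frames directly, without the motion compensation (for example optical-flow warping) that published temporal-consistency metrics typically apply.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union, averaged over classes present in pred or target."""
    pred = np.asarray(pred).astype(np.int64).ravel()
    target = np.asarray(target).astype(np.int64).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predicted classes.
    conf = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    present = union > 0  # skip classes that never occur, to avoid dividing by zero
    return float((intersection[present] / union[present]).mean())


def naive_label_stability(video_pred):
    """Fraction of pixels whose predicted label is unchanged between consecutive frames.

    Assumes at least two frames and ignores camera and object motion.
    """
    video_pred = np.asarray(video_pred)
    return float((video_pred[1:] == video_pred[:-1]).mean())


# Toy two-frame video with 2x2 frames and two classes.
pred = np.array([[[0, 1], [1, 1]],
                 [[0, 1], [0, 1]]])
target = np.array([[[0, 1], [1, 1]],
                   [[0, 1], [1, 1]]])

print(mean_iou(pred, target, num_classes=2))
print(naive_label_stability(pred))
```

Here mIoU is pooled over all frames of the clip. Benchmarks may instead aggregate over an entire dataset split, so scores from this sketch are not directly comparable to leaderboard numbers.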

Papers


Showing 826–850 of 895 papers

Title	Date	Tasks	Status
MCDS-VSS: Moving Camera Dynamic Scene Video Semantic Segmentation by Filtering with Self-Supervised Geometry and Motion	May 30, 2024	Decision Making, Scene Segmentation	Code Available
MASSeg : 2nd Technical Report for 4th PVUW MOSE Track	Apr 14, 2025	Data Augmentation, Object	Code Available
DTOS: Dynamic Time Object Sensing with Large Multimodal Model	Jan 1, 2025	Moment Retrieval, Referring Video Object Segmentation	Code Available
Boosting Video Object Segmentation based on Scale Inconsistency	May 2, 2022	Object, Semantic Segmentation	Code Available
DMM-Net: Differentiable Mask-Matching Network for Video Object Segmentation	Sep 27, 2019	Object, One-shot visual object segmentation	Code Available
Disentangling spatio-temporal knowledge for weakly supervised object detection and segmentation in surgical video	Jul 22, 2024	Disentanglement, Knowledge Distillation	Code Available
Semantic Video Segmentation : Exploring Inference Efficiency	Sep 4, 2015	Image Segmentation, Segmentation	Code Available
Two-Level Temporal Relation Model for Online Video Instance Segmentation	Oct 30, 2022	Graph Neural Network, Instance Segmentation	Code Available
Discriminative Spatial-Semantic VOS Solution: 1st Place Solution for 6th LSVOS	Aug 29, 2024	Object, Object Recognition	Code Available
Semi-supervised Active Learning for Video Action Detection	Dec 12, 2023	Action Detection, Active Learning	Code Available
Video Object Segmentation using Supervoxel-Based Gerrymandering	Apr 18, 2017	Object, Semantic Segmentation	Code Available
Mask Selection and Propagation for Unsupervised Video Object Segmentation	Jan 5, 2021	Segmentation, Semantic Segmentation	Code Available
Video Object Segmentation using Teacher-Student Adaptation in a Human Robot Interaction (HRI) Setting	Oct 17, 2018	Incremental Learning, Robot Manipulation	Code Available
Lucid Data Dreaming for Video Object Segmentation	Mar 28, 2017	Multiple Object Tracking, Object	Code Available
Separable Structure Modeling for Semi-supervised Video Object Segmentation	Feb 18, 2021	Object, One-shot visual object segmentation	Code Available
ViDSOD-100: A New Dataset and a Baseline Model for RGB-D Video Salient Object Detection	Jun 18, 2024	object-detection, Object Detection	Code Available
ALBA : Reinforcement Learning for Video Object Segmentation	May 26, 2020	Object, One-shot visual object segmentation	Code Available
Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation	Mar 17, 2023	Segmentation, Self-Supervised Learning	Code Available
SimLVSeg: Simplifying Left Ventricular Segmentation in 2D+Time Echocardiograms with Self- and Weakly-Supervised Learning	Sep 30, 2023	Left Ventricle Segmentation, LV Segmentation	Code Available
Siamese Network with Interactive Transformer for Video Object Segmentation	Dec 28, 2021	Decoder, Object	Code Available
LSMVOS: Long-Short-Term Similarity Matching for Video Object	Sep 2, 2020	Object, Optical Flow Estimation	Code Available
Asymmetric Cross-Guided Attention Network for Actor and Action Video Segmentation From Natural Language Query	Oct 1, 2019	Referring Expression Segmentation, Segmentation	Code Available
Delta Distillation for Efficient Video Processing	Mar 17, 2022	Knowledge Distillation, object-detection	Code Available
Deep Common Feature Mining for Efficient Video Semantic Segmentation	Mar 5, 2024	Semantic Segmentation, Video Semantic Segmentation	Code Available
DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields	Nov 18, 2023	Decoder, Point Cloud Segmentation	Code Available


Datasets: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

Dataset: Cityscapes val
#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	TDNet-50 [9]	mIoU	79.9	—	Unverified
3	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

Dataset: CamVid
#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

Dataset: VSPW
#	Model	Metric	Claimed	Verified	Status
1	DVIS++(VIT-L)	mIoU	63.8	—	Unverified
2	UniVS(Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link(Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA(MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM(MiT-B5)	mIoU	49.3	—	Unverified

Dataset: LaRS
#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

Dataset: Multispectral Video Semantic Segmentation
#	Model	Metric	Claimed	Verified	Status
1	MVNet(DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet(PSPNet)	mIoU	54.36	—	Unverified
3	MVNet(FCN)	mIoU	53.9	—	Unverified