Video Semantic Segmentation

The goal of video semantic segmentation is to assign one of a predefined set of classes to every pixel in every frame of a video. The model must therefore predict accurate segmentation masks and also keep those masks temporally consistent across frames. The task has broad applications in areas such as autonomous driving, medical video analysis, and AR/VR.
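
The two requirements in the task description, per-pixel class assignment and temporal consistency, can be illustrated with a minimal sketch. It assumes each frame is given as a grid of per-pixel class scores. The helper names `argmax_labels` and `temporal_agreement` are hypothetical, and the agreement score is a deliberately naive proxy: it compares labels at the same pixel position in consecutive frames without any motion compensation, so it is not the consistency measure used by any particular paper listed here.

```python
from typing import List

Frame = List[List[int]]  # H x W grid of class ids


def argmax_labels(scores: List[List[List[float]]]) -> Frame:
    """Turn H x W x C per-pixel class scores into an H x W label map."""
    return [
        [max(range(len(px)), key=px.__getitem__) for px in row]
        for row in scores
    ]


def temporal_agreement(video: List[Frame]) -> float:
    """Fraction of pixels whose label is unchanged between consecutive frames.

    Naive proxy (assumption): no optical-flow warping, so real object or
    camera motion also lowers the score.
    """
    same = 0
    total = 0
    for prev, curr in zip(video, video[1:]):
        for row_prev, row_curr in zip(prev, curr):
            for a, b in zip(row_prev, row_curr):
                same += (a == b)
                total += 1
    return same / total if total else 1.0


# One 1 x 2 frame with scores for three classes per pixel.
frame_scores = [[[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]]
labels = argmax_labels(frame_scores)  # [[1, 0]]

# Two consecutive label maps; one of the two pixels changes class.
video = [[[1, 0]], [[1, 2]]]
agreement = temporal_agreement(video)  # 0.5
```

A flow-warped variant would first align the previous frame's labels to the current frame before comparing, which separates genuine flicker from scene motion.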

Papers

Title	Date	Tasks	Status	Hype
Video Panoptic Segmentation	Jun 19, 2020	Instance Segmentation, Panoptic Segmentation	Code Available	1
Video Semantic Segmentation with Distortion-Aware Feature Correction	Jun 18, 2020	Image Segmentation, Optical Flow Estimation	Code Available	1
Real-Time Video Inference on Edge Devices via Adaptive Model Streaming	Jun 11, 2020	Knowledge Distillation, Semantic Segmentation	Code Available	1
Temporal Aggregate Representations for Long-Range Video Understanding	Jun 1, 2020	Action Anticipation, Action Recognition	Code Available	1
Physarum Powered Differentiable Linear Programming Layers and Applications	Apr 30, 2020	Few-Shot Learning, Meta-Learning	Code Available	1
Fast Template Matching and Update for Video Object Tracking and Segmentation	Apr 16, 2020	Object Tracking, reinforcement-learning	Code Available	1
A Transductive Approach for Video Object Segmentation	Apr 15, 2020	Instance Segmentation, Object	Code Available	1
Temporally Distributed Networks for Fast Video Semantic Segmentation	Apr 3, 2020	Knowledge Distillation, Real-Time Semantic Segmentation	Code Available	1
TapLab: A Fast Framework for Semantic Video Segmentation Tapping into Compressed-Domain Knowledge	Mar 30, 2020	GPU, Image Segmentation	Code Available	1
Learning What to Learn for Video Object Segmentation	Mar 25, 2020	Few-Shot Learning, Object	Code Available	1
Collaborative Video Object Segmentation by Foreground-Background Integration	Mar 18, 2020	Object, One-shot visual object segmentation	Code Available	1
Learning Video Object Segmentation from Unlabeled Videos	Mar 10, 2020	Object, Representation Learning	Code Available	1
Motion-Attentive Transition for Zero-Shot Video Object Segmentation	Mar 9, 2020	Decoder, Object	Code Available	1
State-Aware Tracker for Real-Time Video Object Segmentation	Mar 1, 2020	Segmentation, Semantic Segmentation	Code Available	1
Learning Fast and Robust Target Models for Video Object Segmentation	Feb 27, 2020	One-shot visual object segmentation, Segmentation	Code Available	1
Efficient Semantic Video Segmentation with Per-frame Inference	Feb 26, 2020	Knowledge Distillation, Optical Flow Estimation	Code Available	1
MAST: A Memory-Augmented Self-supervised Tracker	Feb 18, 2020	Semantic Segmentation, Semi-Supervised Video Object Segmentation	Code Available	1
Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation	Feb 17, 2020	GPU, One-shot visual object segmentation	Code Available	1
Fast Video Object Segmentation using the Global Context Module	Jan 30, 2020	Object, Segmentation	Code Available	1
See More, Know More: Unsupervised Video Object Segmentation with Co-Attention Siamese Networks	Jan 19, 2020	Semantic Segmentation, Unsupervised Video Object Segmentation	Code Available	1
Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks	Jan 19, 2020	Graph Neural Network, Segmentation	Code Available	1
UnOVOST: Unsupervised Offline Video Object Segmentation and Tracking	Jan 15, 2020	Object, Segmentation	Code Available	1
Separable Convolutional LSTMs for Faster Video Segmentation	Jul 16, 2019	GPU, Image Segmentation	Code Available	1
Semantic Segmentation of Video Sequences with Convolutional LSTMs	May 3, 2019	Decoder, Image Segmentation	Code Available	1
Video Object Segmentation using Space-Time Memory Networks	Apr 1, 2019	Interactive Video Object Segmentation, Object	Code Available	1
Online Model Distillation for Efficient Video Inference	Dec 6, 2018	model, Segmentation	Code Available	1
Tukey-Inspired Video Object Segmentation	Nov 19, 2018	Object, Segmentation	Code Available	1
YouTube-VOS: Sequence-to-Sequence Video Object Segmentation	Sep 3, 2018	Image Segmentation, Object	Code Available	1
Actor and Action Video Segmentation from a Sentence	Mar 20, 2018	Action Segmentation, Decoder	Code Available	1
Pyramid Scene Parsing Network	Dec 4, 2016	Dichotomous Image Segmentation, Image Classification	Code Available	1
Deep Feature Flow for Video Recognition	Nov 23, 2016	Video Recognition, Video Semantic Segmentation	Code Available	1
Clockwork Convnets for Video Semantic Segmentation	Aug 11, 2016	Image Segmentation, Scheduling	Code Available	1
SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction	Jul 21, 2025	Object, Segmentation	Unverified	0
Memory-Augmented SAM2 for Training-Free Surgical Video Segmentation	Jul 13, 2025	Segmentation, Semantic Segmentation	Unverified	0
MUVOD: A Novel Multi-view Video Object Segmentation Dataset and A Benchmark for 3D Segmentation	Jul 10, 2025	NeRF, Object	Unverified	0
CogGen: A Learner-Centered Generative AI Architecture for Intelligent Tutoring with Programming Video	Jun 25, 2025	Knowledge Tracing, Video Segmentation	Unverified	0
Leader360V: The Large-scale, Real-world 360 Video Dataset for Multi-task Learning in Diverse Environment	Jun 17, 2025	Autonomous Driving, Instance Segmentation	Unverified	0
A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects	Jun 16, 2025	Benchmarking, Instance Segmentation	Unverified	0
Q-SAM2: Accurate Quantization for Segment Anything Model 2	Jun 11, 2025	Quantization, Video Segmentation	Unverified	0
THU-Warwick Submission for EPIC-KITCHEN Challenge 2025: Semi-Supervised Video Object Segmentation	Jun 7, 2025	Segmentation, Semantic Segmentation	Unverified	0
InterRVOS: Interaction-aware Referring Video Object Segmentation	Jun 3, 2025	8k, Object	Unverified	0
OmniFall: A Unified Staged-to-Wild Benchmark for Human Fall Detection	May 26, 2025	Video Segmentation, Video Semantic Segmentation	Code Available	0
ThinkVideo: High-Quality Reasoning Video Segmentation with Chain of Thoughts	May 24, 2025	Image Segmentation, Instance Segmentation	Code Available	0
Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation	May 19, 2025	Referring Video Object Segmentation, Semantic Segmentation	Unverified	0
FlowCut: Unsupervised Video Instance Segmentation via Temporal Mask Matching	May 19, 2025	Instance Segmentation, Segmentation	Unverified	0
VolE: A Point-cloud Framework for Food 3D Reconstruction and Volume Estimation	May 15, 2025	3D Reconstruction, Camera Calibration	Unverified	0
6D Pose Estimation on Spoons and Hands	May 5, 2025	6D Pose Estimation, Pose Estimation	Unverified	0
MoSAM: Motion-Guided Segment Anything Model with Spatial-Temporal Memory Selection	Apr 30, 2025	Instance Segmentation, Interactive Segmentation	Unverified	0
RGB-D Video Object Segmentation via Enhanced Multi-store Feature Memory	Apr 23, 2025	Segmentation, Semantic Segmentation	Unverified	0
Few-Shot Referring Video Single- and Multi-Object Segmentation via Cross-Modal Affinity with Instance Sequence Matching	Apr 18, 2025	Object, Referring Video Object Segmentation	Code Available	0

Datasets: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

Cityscapes val

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	TDNet-50 [9]	mIoU	79.9	—	Unverified
3	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

CamVid

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

VSPW

#	Model	Metric	Claimed	Verified	Status
1	DVIS++(VIT-L)	mIoU	63.8	—	Unverified
2	UniVS(Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link(Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA(MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM(MiT-B5)	mIoU	49.3	—	Unverified

LaRS

#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

Multispectral Video Semantic Segmentation

#	Model	Metric	Claimed	Verified	Status
1	MVNet(DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet(PSPNet)	mIoU	54.36	—	Unverified
3	MVNet(FCN)	mIoU	53.9	—	Unverified
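
Most of the tables above report mIoU (mean intersection over union). As a reminder of what that number measures, here is a minimal sketch that accumulates per-class intersection and union counts over all frames and then averages the per-class IoU. The function name `mean_iou` is hypothetical, and the `ignore_index=255` default is an assumption borrowed from a common labeling convention. Individual benchmarks may differ in which classes they include and how they treat unlabeled pixels, so this is not any benchmark's official evaluation script.

```python
from typing import List

Frame = List[List[int]]  # H x W grid of class ids


def mean_iou(preds: List[Frame], targets: List[Frame],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean IoU over classes, with counts accumulated across all frames."""
    inter = [0] * num_classes  # pixels where prediction and target agree
    union = [0] * num_classes  # pixels predicted as or labeled with the class
    for pred, target in zip(preds, targets):
        for row_p, row_t in zip(pred, target):
            for p, t in zip(row_p, row_t):
                if t == ignore_index:
                    continue  # unlabeled pixel: excluded from the score
                if p == t:
                    inter[t] += 1
                    union[t] += 1
                else:
                    union[t] += 1  # missed pixel of the true class
                    if 0 <= p < num_classes:
                        union[p] += 1  # false alarm for the predicted class
    # Average only over classes that occur in predictions or targets.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# One 2 x 2 frame, two classes: class 0 has IoU 1/2, class 1 has IoU 2/3.
pred = [[[0, 0], [1, 1]]]
target = [[[0, 1], [1, 1]]]
score = mean_iou(pred, target, num_classes=2)  # 7/12
```

Accumulating counts over the whole dataset before dividing, rather than averaging per-frame IoU values, keeps classes that are rare in individual frames from dominating the result.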