Video Semantic Segmentation

The goal of video semantic segmentation is to assign a predefined class to each pixel in all frames of a video. This requires the model not only to predict accurate segmentation masks but also to ensure that these masks remain temporally consistent across frames. This task has broad applications in areas such as autonomous driving, medical video analysis, and AR/VR.
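The two requirements named above, per-pixel accuracy and temporal consistency, can each be given a simple number. The following is a minimal sketch in plain Python, not the evaluation protocol of any benchmark listed on this page. It assumes each frame is a flat list of integer class labels, and `mean_iou`, `naive_temporal_consistency`, and `flatten` are hypothetical helper names. The consistency score only compares the same pixel position in consecutive frames; published metrics usually warp predictions with optical flow first, which this sketch omits.

```python
def flatten(frames):
    """Concatenate per-frame label lists into one flat list."""
    return [label for frame in frames for label in frame]


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes present in pred or target."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            # A mismatched pixel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


def naive_temporal_consistency(frames):
    """Fraction of pixels whose predicted label is unchanged between
    consecutive frames (assumption: no motion compensation)."""
    same = total = 0
    for prev, cur in zip(frames, frames[1:]):
        same += sum(a == b for a, b in zip(prev, cur))
        total += len(cur)
    return same / total if total else 1.0


# Toy video: 2 frames of 4 pixels each, 2 classes.
target = [[0, 0, 1, 1], [0, 0, 1, 1]]
pred = [[0, 0, 1, 1], [0, 1, 1, 1]]

miou = mean_iou(flatten(pred), flatten(target), num_classes=2)
consistency = naive_temporal_consistency(pred)
print(round(miou, 3), consistency)  # 0.775 0.75
```

In the toy example a single flipped pixel in the second frame lowers both scores, which illustrates why the two properties are usually reported together.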

Papers

Showing 501–525 of 895 papers

Title	Date	Tasks	Status
An end-to-end generative framework for video segmentation and recognition	Sep 7, 2015	Video Segmentation, Video Semantic Segmentation	Unverified
Appearance-Based Refinement for Object-Centric Motion Segmentation	Dec 18, 2023	Motion Segmentation, Object	Unverified
Approximate Policy Iteration for Budgeted Semantic Video Segmentation	Jul 26, 2016	Video Segmentation, Video Semantic Segmentation	Unverified
Approximating DTW with a convolutional neural network on EEG data	Jan 30, 2023	Anomaly Detection, Computational Efficiency	Unverified
Architecture Search of Dynamic Cells for Semantic Video Segmentation	Apr 4, 2019	GPU, Neural Architecture Search	Unverified
Artificial intelligence optical hardware empowers high-resolution hyperspectral video understanding at 1.2 Tb/s	Dec 17, 2023	Semantic Segmentation, Video Semantic Segmentation	Unverified
A spatio-temporal network for video semantic segmentation in surgical videos	Jun 19, 2023	Decoder, Segmentation	Unverified
A Survey on Segment Anything Model (SAM): Vision Foundation Model Meets Prompt Engineering	May 12, 2023	Edge Detection, model	Unverified
A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented, Temporal and Depth-aware design	Mar 8, 2023	Autonomous Driving, Autonomous Vehicles	Unverified
AutoDepthNet: High Frame Rate Depth Map Reconstruction using Commodity Depth and RGB Cameras	May 24, 2023	Depth Estimation, GPU	Unverified
Automatic Dance Video Segmentation for Understanding Choreography	May 30, 2024	Segmentation, Video Segmentation	Unverified
Automatic Foreground Extraction from Imperfect Backgrounds using Multi-Agent Consensus Equilibrium	Aug 24, 2018	Denoising, Image Denoising	Unverified
Automatic Interaction and Activity Recognition from Videos of Human Manual Demonstrations with Application to Anomaly Detection	Apr 19, 2023	Activity Recognition, Anomaly Detection	Unverified
Automatic Real-time Background Cut for Portrait Videos	Apr 28, 2017	Segmentation, Semantic Segmentation	Unverified
Automatic Video Object Segmentation via Motion-Appearance-Stream Fusion and Instance-aware Segmentation	Dec 3, 2019	Foreground Segmentation, Instance Segmentation	Unverified
Automatic video scene segmentation based on spatial-temporal clues and rhythm	Dec 15, 2014	Retrieval, Rhythm	Unverified
AUTV: Creating Underwater Video Datasets with Pixel-wise Annotations	Mar 17, 2025	Semantic Segmentation, Video Generation	Unverified
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation	Aug 1, 2022	Object, Optical Flow Estimation	Unverified
Beyond Semantic Image Segmentation: Exploring Efficient Inference in Video	Jul 1, 2015	Image Segmentation, Segmentation	Unverified
Bilateral Space Video Segmentation	Jun 1, 2016	Segmentation, Semi-Supervised Video Object Segmentation	Unverified
Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning	Apr 9, 2018	Metric Learning, Object	Unverified
BoLTVOS: Box-Level Tracking for Video Object Segmentation	Apr 9, 2019	Object, One-shot visual object segmentation	Unverified
Breaking The Ice: Video Segmentation for Close-Range Ice-Covered Waters	Nov 7, 2024	Image Segmentation, Optical Flow Estimation	Unverified
Breaking the "Object" in Video Object Segmentation	Dec 12, 2022	Object, Semantic Segmentation	Unverified
Bringing Background into the Foreground: Making All Classes Equal in Weakly-supervised Video Semantic Segmentation	Aug 15, 2017	All, Autonomous Navigation	Unverified

Datasets: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

Cityscapes val
#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	TDNet-50 [9]	mIoU	79.9	—	Unverified
3	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

CamVid
#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

VSPW
#	Model	Metric	Claimed	Verified	Status
1	DVIS++(VIT-L)	mIoU	63.8	—	Unverified
2	UniVS(Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link(Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA(MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM(MiT-B5)	mIoU	49.3	—	Unverified

LaRS
#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

Multispectral Video Semantic Segmentation
#	Model	Metric	Claimed	Verified	Status
1	MVNet(DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet(PSPNet)	mIoU	54.36	—	Unverified
3	MVNet(FCN)	mIoU	53.9	—	Unverified