Video Semantic Segmentation

The goal of video semantic segmentation is to assign a predefined class to each pixel in all frames of a video. This requires the model not only to predict accurate segmentation masks but also to ensure that these masks remain temporally consistent across frames. This task has broad applications in areas such as autonomous driving, medical video analysis, and AR/VR.
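
The two requirements above can be illustrated with a small sketch. It assumes a model that outputs one score map per class for each frame. The helper names `argmax_labels` and `label_stability` are hypothetical. The stability score is only a naive proxy for temporal consistency: it does no motion compensation, whereas published consistency metrics typically warp predictions with optical flow first.

```python
from typing import List

Frame = List[List[int]]  # H x W grid of class IDs for one frame


def argmax_labels(scores: List[List[List[float]]]) -> Frame:
    """Turn per-class score maps (C x H x W) into one class ID per pixel."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


def label_stability(video: List[Frame]) -> float:
    """Fraction of pixels whose class is unchanged between consecutive frames.

    Naive proxy (assumption): no motion compensation, so genuine object
    motion lowers the score just like flickering predictions do.
    """
    same = total = 0
    for prev, curr in zip(video, video[1:]):
        for row_prev, row_curr in zip(prev, curr):
            for a, b in zip(row_prev, row_curr):
                same += a == b
                total += 1
    return same / total if total else 1.0


# Two classes, one 2x2 frame of scores: class 1 wins only at the top-right pixel.
scores = [
    [[0.9, 0.2], [0.7, 0.6]],  # class 0
    [[0.1, 0.8], [0.3, 0.4]],  # class 1
]
frame0 = argmax_labels(scores)
frame1 = [[0, 1], [0, 1]]  # the bottom-right pixel flips in the next frame
print(frame0, label_stability([frame0, frame1]))
```

In this toy clip three of the four pixels keep their class between the two frames, so the stability score is 0.75.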

Papers

Showing 251–300 of 895 papers

Title	Date	Tasks	Status	Hype
DeepPyramid+: Medical Image Segmentation using Pyramid View Fusion and Deformable Pyramid Reception	Dec 6, 2023	Image Segmentation, Medical Image Segmentation	Unverified	0
Efficient Multimodal Semantic Segmentation via Dual-Prompt Learning	Dec 1, 2023	Decoder, object-detection	Code Available	1
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models	Nov 30, 2023	Semantic Segmentation, Video Editing	Unverified	0
SimulFlow: Simultaneously Extracting Feature and Identifying Target for Unsupervised Video Object Segmentation	Nov 30, 2023	Object, object-detection	Unverified	0
A Simple Video Segmenter by Tracking Objects Along Axial Trajectories	Nov 30, 2023	GPU, Object	Code Available	1
Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation	Nov 29, 2023	Clustering, Object	Code Available	1
SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation	Nov 24, 2023	Meta-Learning, One-Shot Segmentation	Code Available	1
Unified Domain Adaptive Semantic Segmentation	Nov 22, 2023	Data Augmentation, Optical Flow Estimation	Code Available	1
DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields	Nov 18, 2023	Decoder, Point Cloud Segmentation	Code Available	0
Correlation-aware active learning for surgery video segmentation	Nov 15, 2023	Active Learning, Contrastive Learning	Unverified	0
Sketch-based Video Object Segmentation: Benchmark and Analysis	Nov 13, 2023	Object, Segmentation	Unverified	0
Learning the What and How of Annotation in Video Object Segmentation	Nov 8, 2023	Segmentation, Semantic Segmentation	Unverified	0
ISAR: A Benchmark for Single- and Few-Shot Object Instance Segmentation and Re-Identification	Nov 5, 2023	Instance Segmentation, Multi-Object Tracking	Unverified	0
Concatenated Masked Autoencoders as Spatial-Temporal Learner	Nov 2, 2023	Action Recognition, Data Augmentation	Code Available	1
Mask Propagation for Efficient Video Semantic Segmentation	Oct 29, 2023	Semantic Segmentation, Video Semantic Segmentation	Code Available	1
SpVOS: Efficient Video Object Segmentation with Triple Sparse Convolution	Oct 23, 2023	Object, Semantic Segmentation	Unverified	0
Putting the Object Back into Video Object Segmentation	Oct 19, 2023	Object, Segmentation	Code Available	3
Understanding Video Transformers for Segmentation: A Survey of Application and Interpretability	Oct 18, 2023	Segmentation, Video Segmentation	Unverified	0
Zero-Shot Open-Vocabulary Tracking with Large Pre-Trained Models	Oct 10, 2023	Object, Object Tracking	Unverified	0
Sub-token ViT Embedding via Stochastic Resonance Transformers	Oct 6, 2023	Depth Estimation, Depth Prediction	Code Available	0
CoralVOS: Dataset and Benchmark for Coral Video Segmentation	Oct 3, 2023	Segmentation, Semantic Segmentation	Unverified	0
SimLVSeg: Simplifying Left Ventricular Segmentation in 2D+Time Echocardiograms with Self- and Weakly-Supervised Learning	Sep 30, 2023	Left Ventricle Segmentation, LV Segmentation	Code Available	0
Memory-Efficient Continual Learning Object Segmentation for Long Video	Sep 26, 2023	Continual Learning, Object	Unverified	0
Treating Motion as Option with Output Selection for Unsupervised Video Object Segmentation	Sep 26, 2023	Object, Optical Flow Estimation	Code Available	1
Adversarial Attacks on Video Object Segmentation with Hard Region Discovery	Sep 25, 2023	Autonomous Driving, Object	Unverified	0
MediViSTA: Medical Video Segmentation via Temporal Fusion SAM Adaptation for Echocardiography	Sep 24, 2023	Image Segmentation, Medical Image Segmentation	Code Available	1
Rethinking Amodal Video Segmentation from Learning Supervised Signals with Object-centric Representation	Sep 23, 2023	Object, Video Segmentation	Code Available	0
SANPO: A Scene Understanding, Accessibility and Human Navigation Dataset	Sep 21, 2023	Autonomous Vehicles, Depth Estimation	Unverified	0
Efficient Long-Short Temporal Attention Network for Unsupervised Video Object Segmentation	Sep 21, 2023	Semantic Segmentation, Unsupervised Video Object Segmentation	Unverified	0
PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation	Sep 21, 2023	Autonomous Driving, Segmentation	Code Available	1
Fully Transformer-Equipped Architecture for End-to-End Referring Video Object Segmentation	Sep 21, 2023	Object, Referring Video Object Segmentation	Unverified	0
MoDA: Leveraging Motion Priors from Videos for Advancing Unsupervised Domain Adaptation in Semantic Segmentation	Sep 21, 2023	Domain Adaptation, Image Segmentation	Code Available	0
GraphEcho: Graph-Driven Unsupervised Domain Adaptation for Echocardiogram Video Segmentation	Sep 20, 2023	Domain Adaptation, Graph Matching	Code Available	1
Multi-grained Temporal Prototype Learning for Few-shot Video Object Segmentation	Sep 20, 2023	Image Segmentation, Segmentation	Code Available	0
GL-Fusion: Global-Local Fusion Network for Multi-view Echocardiogram Video Segmentation	Sep 20, 2023	Video Segmentation, Video Semantic Segmentation	Code Available	0
CATR: Combinatorial-Dependence Audio-Queried Transformer for Audio-Visual Video Segmentation	Sep 18, 2023	Video Segmentation, Video Semantic Segmentation	Code Available	1
Temporal-aware Hierarchical Mask Classification for Video Semantic Segmentation	Sep 14, 2023	Classification, Decoder	Code Available	0
Temporal Collection and Distribution for Referring Video Object Segmentation	Sep 7, 2023	Object, Referring Video Object Segmentation	Unverified	0
Tracking Anything with Decoupled Video Segmentation	Sep 7, 2023	Open-Vocabulary Video Segmentation, Open-World Video Segmentation	Code Available	3
Robust Visual Tracking by Motion Analyzing	Sep 6, 2023	Object Tracking, Segmentation	Unverified	0
Learning Cross-Modal Affinity for Referring Video Object Segmentation Targeting Limited Samples	Sep 5, 2023	Referring Video Object Segmentation, Semantic Segmentation	Code Available	0
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation	Aug 28, 2023	Instance Segmentation, Optical Flow Estimation	Code Available	3
Joint Modeling of Feature, Correspondence, and a Compressed Memory for Video Object Segmentation	Aug 25, 2023	Semantic Segmentation, Video Object Segmentation	Unverified	0
Integrating Boxes and Masks: A Multi-Object Framework for Unified Visual Tracking and Segmentation	Aug 25, 2023	Object, Object Tracking	Code Available	1
Robotic Scene Segmentation with Memory Network for Runtime Surgical Context Inference	Aug 24, 2023	Scene Segmentation, Segmentation	Code Available	0
LOCATE: Self-supervised Object Discovery via Flow-guided Graph-cut and Bootstrapped Self-training	Aug 22, 2023	Object, Object Discovery	Code Available	0
MEGA: Multimodal Alignment Aggregation and Distillation For Cinematic Video Segmentation	Aug 22, 2023	Scene Segmentation, Segmentation	Unverified	0
Scalable Video Object Segmentation with Simplified Framework	Aug 19, 2023	Object, Semantic Segmentation	Unverified	0
LaRS: A Diverse Panoptic Maritime Obstacle Detection Dataset and Benchmark	Aug 18, 2023	Diversity, Panoptic Segmentation	Code Available	1
MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions	Aug 16, 2023	Motion Expressions Guided Video Segmentation, Object	Code Available	2

Datasets: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

Cityscapes val

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
3	TDNet-50 [9]	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

CamVid

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

VSPW

#	Model	Metric	Claimed	Verified	Status
1	DVIS++(VIT-L)	mIoU	63.8	—	Unverified
2	UniVS(Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link(Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA(MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM(MiT-B5)	mIoU	49.3	—	Unverified

LaRS

#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

Multispectral Video Semantic Segmentation

#	Model	Metric	Claimed	Verified	Status
1	MVNet(DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet(PSPNet)	mIoU	54.36	—	Unverified
3	MVNet(FCN)	mIoU	53.9	—	Unverified
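
Most of the tables above report mIoU (mean intersection over union). As a rough illustration only, the sketch below computes it for flattened label arrays. The function name `mean_iou` is hypothetical, and the exact protocol behind each leaderboard entry is an assumption here: benchmarks usually accumulate counts over the whole evaluation set and may skip an "ignore" label, neither of which this sketch does.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes that occur in the prediction or the ground truth."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Correct pixel: counts once toward both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Flattened predictions and ground truth for six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(round(100 * mean_iou(pred, target, num_classes=3), 1))
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5, which is reported as 50.0 on the percentage scale used in the tables.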