Video Semantic Segmentation

The goal of video semantic segmentation is to assign a predefined class to each pixel in all frames of a video. This requires the model not only to predict accurate segmentation masks but also to ensure that these masks remain temporally consistent across frames. This task has broad applications in areas such as autonomous driving, medical video analysis, and AR/VR.
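
The two requirements above, per-pixel accuracy and temporal consistency, can be made concrete with a small sketch. It is an illustration under stated assumptions, not the evaluation protocol of any benchmark listed on this page: the function names `mean_iou` and `naive_temporal_consistency` are hypothetical, frames are plain nested lists of class ids, and the consistency score simply compares raw consecutive frames. Published temporal-consistency metrics typically warp predictions with optical flow first, which this sketch omits.

```python
from typing import List, Optional

Frame = List[List[int]]  # one predicted or ground-truth class id per pixel


def mean_iou(preds: List[Frame], targets: List[Frame], num_classes: int) -> Optional[float]:
    """mIoU accumulated over every pixel of every frame.

    Classes that appear in neither the predictions nor the ground truth
    are skipped rather than counted as zero.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for pred, target in zip(preds, targets):
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                if p == t:
                    inter[p] += 1
                    union[p] += 1
                else:
                    # A mismatched pixel enlarges the union of both classes.
                    union[p] += 1
                    union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else None


def naive_temporal_consistency(preds: List[Frame]) -> Optional[float]:
    """Fraction of pixels whose predicted label is unchanged between
    consecutive frames (assumption: no motion compensation)."""
    same = total = 0
    for prev, curr in zip(preds, preds[1:]):
        for prev_row, curr_row in zip(prev, curr):
            for a, b in zip(prev_row, curr_row):
                same += int(a == b)
                total += 1
    return same / total if total else None


# Toy video: two 2x2 frames, two classes; one pixel flips in the second frame.
targets = [[[0, 0], [1, 1]], [[0, 0], [1, 1]]]
preds = [[[0, 0], [1, 1]], [[0, 1], [1, 1]]]

miou = mean_iou(preds, targets, num_classes=2)        # approximately 0.775
consistency = naive_temporal_consistency(preds)      # 0.75
print(miou, consistency)
```

In the toy video the single flipped pixel lowers both scores, which shows why a model can have a reasonable per-frame mIoU and still flicker between frames. Note that the benchmark tables on this page report mIoU on a 0 to 100 scale, while the sketch returns a fraction.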

Papers


Showing 101–150 of 895 papers

Title	Date	Tasks	Status	Hype
VideoMAC: Video Masked Autoencoders Meet ConvNets	Feb 29, 2024	Pose Tracking, Representation Learning	Code Available	1
Lester: rotoscope animation through video object segmentation and tracking	Feb 15, 2024	3D Human Pose Estimation, Object	Code Available	1
We're Not Using Videos Effectively: An Updated Domain Adaptive Video Segmentation Baseline	Feb 1, 2024	Benchmarking, Domain Adaptation	Code Available	1
1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation	Jan 1, 2024	Object, Referring Video Object Segmentation	Code Available	1
Tracking with Human-Intent Reasoning	Dec 29, 2023	Language Modelling, Object	Code Available	1
DVIS++: Improved Decoupled Framework for Universal Video Segmentation	Dec 20, 2023	Contrastive Learning, Denoising	Code Available	1
AutoVisual Fusion Suite: A Comprehensive Evaluation of Image Segmentation and Voice Conversion Tools on HuggingFace Platform	Dec 17, 2023	Image Segmentation, Segmentation	Code Available	1
Efficient Multimodal Semantic Segmentation via Dual-Prompt Learning	Dec 1, 2023	Decoder, object-detection	Code Available	1
A Simple Video Segmenter by Tracking Objects Along Axial Trajectories	Nov 30, 2023	GPU, Object	Code Available	1
Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation	Nov 29, 2023	Clustering, Object	Code Available	1
SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation	Nov 24, 2023	Meta-Learning, One-Shot Segmentation	Code Available	1
Unified Domain Adaptive Semantic Segmentation	Nov 22, 2023	Data Augmentation, Optical Flow Estimation	Code Available	1
Concatenated Masked Autoencoders as Spatial-Temporal Learner	Nov 2, 2023	Action Recognition, Data Augmentation	Code Available	1
Mask Propagation for Efficient Video Semantic Segmentation	Oct 29, 2023	Semantic Segmentation, Video Semantic Segmentation	Code Available	1
Treating Motion as Option with Output Selection for Unsupervised Video Object Segmentation	Sep 26, 2023	Object, Optical Flow Estimation	Code Available	1
MediViSTA: Medical Video Segmentation via Temporal Fusion SAM Adaptation for Echocardiography	Sep 24, 2023	Image Segmentation, Medical Image Segmentation	Code Available	1
PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation	Sep 21, 2023	Autonomous Driving, Segmentation	Code Available	1
GraphEcho: Graph-Driven Unsupervised Domain Adaptation for Echocardiogram Video Segmentation	Sep 20, 2023	Domain Adaptation, Graph Matching	Code Available	1
CATR: Combinatorial-Dependence Audio-Queried Transformer for Audio-Visual Video Segmentation	Sep 18, 2023	Video Segmentation, Video Semantic Segmentation	Code Available	1
Integrating Boxes and Masks: A Multi-Object Framework for Unified Visual Tracking and Segmentation	Aug 25, 2023	Object, Object Tracking	Code Available	1
LaRS: A Diverse Panoptic Maritime Obstacle Detection Dataset and Benchmark	Aug 18, 2023	Diversity, Panoptic Segmentation	Code Available	1
Isomer: Isomerous Transformer for Zero-shot Video Object Segmentation	Aug 13, 2023	Semantic Segmentation, Video Object Segmentation	Code Available	1
Stochastic positional embeddings improve masked image modeling	Jul 31, 2023	Language Modelling, Masked Language Modeling	Code Available	1
Spectrum-guided Multi-granularity Referring Video Object Segmentation	Jul 25, 2023	Object, Referring Expression Segmentation	Code Available	1
OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation	Jul 18, 2023	Referring Expression Segmentation, Referring Video Object Segmentation	Code Available	1
NVDS+: Towards Efficient and Versatile Neural Stabilizer for Video Depth Estimation	Jul 17, 2023	3D Reconstruction, Depth Estimation	Code Available	1
RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation	Jul 3, 2023	Image Segmentation, Referring Expression	Code Available	1
LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation	Jun 14, 2023	Referring Expression Segmentation, Referring Video Object Segmentation	Code Available	1
3rd Place Solution for PVUW2023 VSS Track: A Large Model for Semantic Segmentation on VSPW	Jun 4, 2023	Position, Segmentation	Code Available	1
SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation	May 26, 2023	cross-modal alignment, Object	Code Available	1
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation	May 25, 2023	Object, Referring Expression Segmentation	Code Available	1
UVOSAM: A Mask-free Paradigm for Unsupervised Video Object Segmentation via Segment Anything Model	May 22, 2023	Image Segmentation, Object	Code Available	1
Event-Free Moving Object Segmentation from Moving Ego Vehicle	Apr 28, 2023	Autonomous Driving, Benchmarking	Code Available	1
Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping	Apr 17, 2023	Motion Segmentation, Object	Code Available	1
Segment Everything Everywhere All at Once	Apr 13, 2023	All, Decoder	Code Available	1
Boosting Video Object Segmentation via Space-time Correspondence Learning	Apr 13, 2023	Object, Segmentation	Code Available	1
DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks	Apr 2, 2023	Diversity, Object Tracking	Code Available	1
Reliability-Hierarchical Memory Network for Scribble-Supervised Video Object Segmentation	Mar 25, 2023	Semantic Segmentation, Video Object Segmentation	Code Available	1
CrOC: Cross-View Online Clustering for Dense Visual Representation Learning	Mar 23, 2023	Clustering, Online Clustering	Code Available	1
Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation	Mar 22, 2023	Contrastive Learning, Segmentation	Code Available	1
Two-shot Video Object Segmentation	Mar 21, 2023	Object, Pseudo Label	Code Available	1
Adaptive Multi-source Predictor for Zero-shot Video Object Segmentation	Mar 18, 2023	Object, Optical Flow Estimation	Code Available	1
Global Knowledge Calibration for Fast Open-Vocabulary Segmentation	Mar 16, 2023	Knowledge Distillation, Open Vocabulary Semantic Segmentation	Code Available	1
Guided Slot Attention for Unsupervised Video Object Segmentation	Mar 15, 2023	Object, Semantic Segmentation	Code Available	1
Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos	Mar 13, 2023	Segmentation, Semantic Segmentation	Code Available	1
Video-SwinUNet: Spatio-temporal Deep Learning Framework for VFSS Instance Segmentation	Feb 22, 2023	Decoder, Image Segmentation	Code Available	1
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation	Feb 14, 2023	Decoder, Image Segmentation	Code Available	1
Self-Supervised Unseen Object Instance Segmentation via Long-Term Robot Interaction	Feb 7, 2023	Instance Segmentation, Multi-Object Tracking	Code Available	1
TarViS: A Unified Approach for Target-based Video Segmentation	Jan 6, 2023	Instance Segmentation, Panoptic Segmentation	Code Available	1
End-to-End Video Matting With Trimap Propagation	Jan 1, 2023	Image Matting, Segmentation	Code Available	1



Datasets: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

Cityscapes val
#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	TDNet-50 [9]	mIoU	79.9	—	Unverified
3	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

CamVid
#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

VSPW
#	Model	Metric	Claimed	Verified	Status
1	DVIS++(VIT-L)	mIoU	63.8	—	Unverified
2	UniVS(Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link(Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA(MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM(MiT-B5)	mIoU	49.3	—	Unverified

LaRS
#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

Multispectral Video Semantic Segmentation
#	Model	Metric	Claimed	Verified	Status
1	MVNet(DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet(PSPNet)	mIoU	54.36	—	Unverified
3	MVNet(FCN)	mIoU	53.9	—	Unverified