Video Semantic Segmentation

The goal of video semantic segmentation is to assign a predefined class to each pixel in all frames of a video. This requires the model not only to predict accurate segmentation masks but also to ensure that these masks remain temporally consistent across frames. This task has broad applications in areas such as autonomous driving, medical video analysis, and AR/VR.
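The two requirements in the description above (a class label for every pixel in every frame, and labels that stay stable over time) can be illustrated with a small sketch. The function name `naive_temporal_consistency` and the toy masks are hypothetical, and the measure assumes a static scene: it compares consecutive frames pixel by pixel without any motion compensation, which published consistency metrics typically add (for example by warping with optical flow).

```python
import numpy as np


def naive_temporal_consistency(masks: np.ndarray) -> float:
    """Fraction of pixels whose predicted class is unchanged between
    consecutive frames.

    masks: integer array of shape (T, H, W), one class id per pixel per frame.
    Assumption: no motion compensation, so this is only meaningful when the
    scene and camera are (nearly) static.
    """
    if masks.ndim != 3 or masks.shape[0] < 2:
        raise ValueError("expected an array of shape (T, H, W) with T >= 2")
    # Compare frame t+1 with frame t for every pixel.
    unchanged = masks[1:] == masks[:-1]
    return float(unchanged.mean())


# Toy video: 3 frames of 2x2 pixels, with class 0 = road and class 1 = car.
video_masks = np.array([
    [[0, 0], [1, 1]],
    [[0, 0], [1, 1]],
    [[0, 1], [1, 1]],  # one pixel flips from road to car
])

score = naive_temporal_consistency(video_masks)
print(f"naive temporal consistency: {score:.3f}")
```

A score of 1.0 means no pixel ever changes class. In real footage, legitimate motion lowers this number, which is why it should be read as an illustration of the idea rather than as a benchmark metric.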

Papers

Showing 226–250 of 895 papers

Title	Date	Tasks	Status	Hype
Is Two-shot All You Need? A Label-efficient Approach for Video Segmentation in Breast Ultrasound	Feb 7, 2024	All, Lesion Segmentation	Unverified	0
We're Not Using Videos Effectively: An Updated Domain Adaptive Video Segmentation Baseline	Feb 1, 2024	Benchmarking, Domain Adaptation	Code Available	1
Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes	Jan 27, 2024	Motion Estimation, Segmentation	Code Available	0
Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention	Jan 25, 2024	Knowledge Distillation, Object	Unverified	0
Vivim: a Video Vision Mamba for Medical Video Segmentation	Jan 25, 2024	Lesion Segmentation, Mamba	Code Available	2
Explore Synergistic Interaction Across Frames for Interactive Video Object Segmentation	Jan 23, 2024	Interactive Video Object Segmentation, Semantic Segmentation	Unverified	0
Understanding Video Transformers via Universal Concept Discovery	Jan 19, 2024	Action Recognition, Decision Making	Unverified	0
OMG-Seg: Is One Model Good Enough For All Segmentation?	Jan 18, 2024	All, Decoder	Code Available	5
RAP-SAM: Towards Real-Time All-Purpose Segment Anything	Jan 18, 2024	All, Decoder	Code Available	3
Learning to Segment Referred Objects from Narrated Egocentric Videos	Jan 1, 2024	Object, Segmentation	Unverified	0
MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation	Jan 1, 2024	Segmentation, Video Segmentation	Code Available	2
Infer from What You Have Seen Before: Temporally-dependent Classifier for Semi-supervised Video Segmentation	Jan 1, 2024	Representation Learning, Semantic Segmentation	Code Available	0
1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation	Jan 1, 2024	Object, Referring Video Object Segmentation	Code Available	1
Tracking with Human-Intent Reasoning	Dec 29, 2023	Language Modelling, Object	Code Available	1
UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces	Dec 25, 2023	Image Segmentation, Object	Code Available	2
DVIS++: Improved Decoupled Framework for Universal Video Segmentation	Dec 20, 2023	Contrastive Learning, Denoising	Code Available	1
No More Shortcuts: Realizing the Potential of Temporal Self-Supervision	Dec 20, 2023	Action Classification, Attribute	Unverified	0
Appearance-Based Refinement for Object-Centric Motion Segmentation	Dec 18, 2023	Motion Segmentation, Object	Unverified	0
AutoVisual Fusion Suite: A Comprehensive Evaluation of Image Segmentation and Voice Conversion Tools on HuggingFace Platform	Dec 17, 2023	Image Segmentation, Segmentation	Code Available	1
Artificial intelligence optical hardware empowers high-resolution hyperspectral video understanding at 1.2 Tb/s	Dec 17, 2023	Semantic Segmentation, Video Semantic Segmentation	Unverified	0
Hierarchical Graph Pattern Understanding for Zero-Shot VOS	Dec 15, 2023	Decoder, Graph Neural Network	Code Available	0
TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking	Dec 13, 2023	Semantic Segmentation, Video Object Segmentation	Unverified	0
Semi-supervised Active Learning for Video Action Detection	Dec 12, 2023	Action Detection, Active Learning	Code Available	0
Flexible visual prompts for in-context learning in computer vision	Dec 11, 2023	Image Segmentation, In-Context Learning	Code Available	0
GenDeF: Learning Generative Deformation Field for Video Generation	Dec 7, 2023	Disentanglement, Video Editing	Unverified	0

Datasets, in the order of the result tables below: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
3	TDNet-50 [9]	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DVIS++(VIT-L)	mIoU	63.8	—	Unverified
2	UniVS(Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link(Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA(MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM(MiT-B5)	mIoU	49.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MVNet(DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet(PSPNet)	mIoU	54.36	—	Unverified
3	MVNet(FCN)	mIoU	53.9	—	Unverified
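
Most of the tables above report mIoU (mean intersection over union). As a rough guide to how that number is formed, the following is a minimal sketch, not the evaluation code of any listed benchmark. The function name `mean_iou` is hypothetical, and the sketch assumes that classes absent from both prediction and ground truth are skipped and that there is no "ignore" label. Official benchmark protocols usually accumulate the confusion matrix over the whole dataset and may handle ignored pixels differently.

```python
import numpy as np


def mean_iou(pred, target, num_classes: int) -> float:
    """Mean intersection-over-union between two integer label maps.

    Assumptions (not taken from any benchmark's protocol): labels are in
    [0, num_classes), there is no ignore label, and classes that appear in
    neither map are excluded from the mean.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection
    present = union > 0
    return float((intersection[present] / union[present]).mean())


# Toy 2x2 frame with two classes.
prediction = [[0, 0], [1, 1]]
ground_truth = [[0, 1], [1, 1]]
miou = mean_iou(prediction, ground_truth, num_classes=2)
print(f"mIoU: {miou:.4f}")
```

For a video, the same function can be applied to stacked masks of shape (T, H, W), since all pixels are flattened before counting.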