Video Semantic Segmentation

The goal of video semantic segmentation is to assign a predefined class to each pixel in all frames of a video. This requires the model not only to predict accurate segmentation masks but also to ensure that these masks remain temporally consistent across frames. This task has broad applications in areas such as autonomous driving, medical video analysis, and AR/VR.
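The temporal-consistency requirement described above can be made concrete with a deliberately naive sketch. It assumes each frame is a grid of integer class ids and simply counts how many pixels keep their class from one frame to the next. The function name `label_stability` and the toy data are hypothetical. The measure ignores camera and object motion, so it is only an illustration of the idea, not a metric used by any of the benchmarks listed here; published consistency metrics typically align frames (for example with optical flow) before comparing labels.

```python
from typing import List

Mask = List[List[int]]  # one frame: rows of per-pixel class ids


def label_stability(masks: List[Mask]) -> float:
    """Fraction of pixels whose class id is unchanged between consecutive frames.

    Naive on purpose: pixels are compared at the same coordinates, with no
    motion compensation, so real scene motion is counted as inconsistency.
    """
    same = 0
    total = 0
    for prev, curr in zip(masks, masks[1:]):
        for row_prev, row_curr in zip(prev, curr):
            for a, b in zip(row_prev, row_curr):
                same += int(a == b)
                total += 1
    # A clip with fewer than two frames has nothing to disagree with.
    return same / total if total else 1.0


# Toy clip: three 2x2 frames; one pixel flips class in the last frame.
video = [
    [[0, 0], [1, 1]],
    [[0, 0], [1, 1]],
    [[0, 1], [1, 1]],
]
stability = label_stability(video)
print(stability)  # 7 of 8 pixel comparisons agree -> 0.875
```

A score of 1.0 means no pixel ever changes class, which is only desirable for a static scene; that is why practical metrics compensate for motion first.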

Papers

Showing 401–450 of 895 papers

Title	Date	Tasks	Status
ViDSOD-100: A New Dataset and a Baseline Model for RGB-D Video Salient Object Detection	Jun 18, 2024	object-detection, Object Detection	Code Available
2nd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation	Jun 12, 2024	Instance Segmentation, Semantic Segmentation	Unverified
RMem: Restricted Memory Banks Improve Video Object Segmentation	Jun 12, 2024	Object, Semantic Segmentation	Unverified
Visual Representation Learning with Stochastic Frame Prediction	Jun 11, 2024	Decoder, Pose Tracking	Unverified
I-MPN: Inductive Message Passing Network for Efficient Human-in-the-Loop Annotation of Mobile Eye Tracking Data	Jun 10, 2024	Navigate, Object	Unverified
Training-Free Robust Interactive Video Object Segmentation	Jun 8, 2024	Interactive Video Object Segmentation, Object	Unverified
1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation	Jun 8, 2024	Benchmarking, Instance Segmentation	Unverified
1st Place Solution for MOSE Track in CVPR 2024 PVUW Workshop: Complex Video Object Segmentation	Jun 7, 2024	Object, Segmentation	Unverified
3rd Place Solution for MeViS Track in CVPR 2024 PVUW workshop: Motion Expression guided Video Segmentation	Jun 7, 2024	Referring Video Object Segmentation, Semantic Segmentation	Unverified
A Semi-Self-Supervised Approach for Dense-Pattern Video Object Segmentation	Jun 7, 2024	Multi-Task Learning, Object	Unverified
3rd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation	Jun 6, 2024	Object, Position	Unverified
Semi-supervised Video Semantic Segmentation Using Unreliable Pseudo Labels for PVUW2024	Jun 2, 2024	Scene Parsing, Scene Understanding	Unverified
Automatic Dance Video Segmentation for Understanding Choreography	May 30, 2024	Segmentation, Video Segmentation	Unverified
MCDS-VSS: Moving Camera Dynamic Scene Video Semantic Segmentation by Filtering with Self-Supervised Geometry and Motion	May 30, 2024	Decision Making, Scene Segmentation	Code Available
Lifelong Learning Using a Dynamically Growing Tree of Sub-networks for Domain Generalization in Video Object Segmentation	May 29, 2024	Domain Generalization, Lifelong learning	Unverified
One-shot Training for Video Object Segmentation	May 22, 2024	Object, Semantic Segmentation	Unverified
Harnessing Vision-Language Pretrained Models with Temporal-Aware Adaptation for Referring Video Object Segmentation	May 17, 2024	Referring Expression Segmentation, Referring Video Object Segmentation	Unverified
Global Motion Understanding in Large-Scale Video Object Segmentation	May 11, 2024	Instance Segmentation, Optical Flow Estimation	Unverified
DeVOS: Flow-Guided Deformable Transformer for Video Object Segmentation	May 11, 2024	Optical Flow Estimation, Semantic Segmentation	Unverified
Space-time Reinforcement Network for Video Object Segmentation	May 7, 2024	Object, Semantic Segmentation	Unverified
360VOTS: Visual Object Tracking and Segmentation in Omnidirectional Videos	Apr 22, 2024	Object, Object Tracking	Unverified
arcjetCV: an open-source software to analyze material ablation	Apr 17, 2024	Video Segmentation, Video Semantic Segmentation	Code Available
Spatial-Temporal Multi-level Association for Video Object Segmentation	Apr 9, 2024	Object, Segmentation	Unverified
Annolid: Annotate, Segment, and Track Anything You Need	Mar 27, 2024	Instance Segmentation, Segmentation	Code Available
Triple Component Matrix Factorization: Untangling Global, Local, and Noisy Components	Mar 21, 2024	Anomaly Detection, Video Segmentation	Unverified
OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework	Mar 13, 2024	All, Management	Unverified
ClickVOS: Click Video Object Segmentation	Mar 10, 2024	Object, Segmentation	Code Available
Deep Common Feature Mining for Efficient Video Semantic Segmentation	Mar 5, 2024	Semantic Segmentation, Video Semantic Segmentation	Code Available
Motion-Corrected Moving Average: Including Post-Hoc Temporal Information for Improved Video Segmentation	Mar 5, 2024	Optical Flow Estimation, Segmentation	Unverified
PolypNextLSTM: A lightweight and fast polyp video segmentation network using ConvNext and ConvLSTM	Feb 18, 2024	Segmentation, Video Segmentation	Code Available
Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation	Feb 14, 2024	Decoder, Object	Unverified
Point-VOS: Pointing Up Video Object Segmentation	Feb 8, 2024	Object, Semantic Segmentation	Unverified
Is Two-shot All You Need? A Label-efficient Approach for Video Segmentation in Breast Ultrasound	Feb 7, 2024	All, Lesion Segmentation	Unverified
Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes	Jan 27, 2024	Motion Estimation, Segmentation	Code Available
Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention	Jan 25, 2024	Knowledge Distillation, Object	Unverified
Explore Synergistic Interaction Across Frames for Interactive Video Object Segmentation	Jan 23, 2024	Interactive Video Object Segmentation, Semantic Segmentation	Unverified
Understanding Video Transformers via Universal Concept Discovery	Jan 19, 2024	Action Recognition, Decision Making	Unverified
Infer from What You Have Seen Before: Temporally-dependent Classifier for Semi-supervised Video Segmentation	Jan 1, 2024	Representation Learning, Semantic Segmentation	Code Available
Learning to Segment Referred Objects from Narrated Egocentric Videos	Jan 1, 2024	Object, Segmentation	Unverified
No More Shortcuts: Realizing the Potential of Temporal Self-Supervision	Dec 20, 2023	Action Classification, Attribute	Unverified
Appearance-Based Refinement for Object-Centric Motion Segmentation	Dec 18, 2023	Motion Segmentation, Object	Unverified
Artificial intelligence optical hardware empowers high-resolution hyperspectral video understanding at 1.2 Tb/s	Dec 17, 2023	Semantic Segmentation, Video Semantic Segmentation	Unverified
Hierarchical Graph Pattern Understanding for Zero-Shot VOS	Dec 15, 2023	Decoder, Graph Neural Network	Code Available
TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking	Dec 13, 2023	Semantic Segmentation, Video Object Segmentation	Unverified
Semi-supervised Active Learning for Video Action Detection	Dec 12, 2023	Action Detection, Active Learning	Code Available
Flexible visual prompts for in-context learning in computer vision	Dec 11, 2023	Image Segmentation, In-Context Learning	Code Available
GenDeF: Learning Generative Deformation Field for Video Generation	Dec 7, 2023	Disentanglement, Video Editing	Unverified
DeepPyramid+: Medical Image Segmentation using Pyramid View Fusion and Deformable Pyramid Reception	Dec 6, 2023	Image Segmentation, Medical Image Segmentation	Unverified
SimulFlow: Simultaneously Extracting Feature and Identifying Target for Unsupervised Video Object Segmentation	Nov 30, 2023	Object, object-detection	Unverified
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models	Nov 30, 2023	Semantic Segmentation, Video Editing	Unverified


Datasets: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

Cityscapes val

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	TDNet-50 [9]	mIoU	79.9	—	Unverified
3	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

CamVid

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

VSPW

#	Model	Metric	Claimed	Verified	Status
1	DVIS++(VIT-L)	mIoU	63.8	—	Unverified
2	UniVS(Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link(Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA(MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM(MiT-B5)	mIoU	49.3	—	Unverified

LaRS

#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

Multispectral Video Semantic Segmentation

#	Model	Metric	Claimed	Verified	Status
1	MVNet(DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet(PSPNet)	mIoU	54.36	—	Unverified
3	MVNet(FCN)	mIoU	53.9	—	Unverified
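
Most of the tables above report mIoU (mean intersection over union). As a rough illustration of how such a score is commonly computed over a video, the sketch below accumulates per-class intersection and union counts across all frames and averages the per-class ratios. The function name `mean_iou` and the toy masks are hypothetical. The sketch assumes integer class ids in `range(num_classes)` and no "ignore" label; individual benchmarks may differ on details such as ignored regions or per-frame versus per-dataset averaging.

```python
from typing import List

Mask = List[List[int]]  # one frame: rows of per-pixel class ids


def mean_iou(preds: List[Mask], targets: List[Mask], num_classes: int) -> float:
    """Mean IoU over classes, with counts accumulated across all frames."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for pred, target in zip(preds, targets):
        for row_p, row_t in zip(pred, target):
            for p, t in zip(row_p, row_t):
                if p == t:
                    # Correct pixel: counts once in both intersection and union.
                    inter[p] += 1
                    union[p] += 1
                else:
                    # Wrong pixel: enlarges the union of both classes involved.
                    union[p] += 1
                    union[t] += 1
    # Average only over classes that appear in predictions or ground truth.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: a single 2x2 frame with one mislabeled pixel.
targets = [[[0, 0], [1, 1]]]
preds = [[[0, 1], [1, 1]]]
score = mean_iou(preds, targets, num_classes=2)
print(round(100 * score, 2))  # class 0: 1/2, class 1: 2/3 -> 58.33
```

Multiplying by 100, as in the last line, gives a percentage on the same 0 to 100 scale that the leaderboard values appear to use.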