Video Semantic Segmentation

Video semantic segmentation assigns one of a predefined set of classes to every pixel in every frame of a video. The model must predict accurate segmentation masks and also keep those masks temporally consistent across frames. Applications include autonomous driving, medical video analysis, and AR/VR.
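The temporal-consistency requirement can be illustrated with a small sketch. It assumes each frame's prediction is a 2D grid of integer class IDs and measures how often a pixel position keeps its label between consecutive frames. This is a deliberate simplification: it ignores camera and object motion, whereas published consistency metrics typically warp predictions with optical flow first. The function names and the toy data are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # one class ID per pixel, indexed [row][col]


def label_agreement(prev: Frame, curr: Frame) -> float:
    """Fraction of pixel positions whose class ID is unchanged between two frames."""
    total = 0
    same = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += 1
            same += int(a == b)
    # Two empty frames are treated as perfectly consistent (an assumption).
    return same / total if total else 1.0


def temporal_agreement(video: List[Frame]) -> float:
    """Average label agreement over all consecutive frame pairs (no motion compensation)."""
    pairs = list(zip(video, video[1:]))
    if not pairs:
        return 1.0
    return sum(label_agreement(p, c) for p, c in pairs) / len(pairs)


# Toy 2x2 video with three frames; the class IDs are arbitrary (say 0 = road, 1 = car).
video = [
    [[0, 0], [1, 1]],
    [[0, 0], [1, 1]],
    [[0, 1], [1, 1]],
]
score = temporal_agreement(video)
print(f"temporal agreement: {score:.3f}")
```

A score near 1.0 means labels rarely flicker at fixed pixel positions. In scenes with real motion a lower score is expected even for a perfect model, which is why motion-compensated variants are generally preferred.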

Papers

Showing 151–200 of 895 papers

Title	Date	Tasks	Status	Hype
Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2	Aug 3, 2024	Diversity, Segmentation	Code Available	3
SAM 2: Segment Anything in Images and Videos	Aug 1, 2024	Image Segmentation, Robot Manipulation Generalization	Code Available	12
Strike the Balance: On-the-Fly Uncertainty based User Interactions for Long-Term Video Object Segmentation	Jul 31, 2024	Object, Segmentation	Code Available	0
Disentangling spatio-temporal knowledge for weakly supervised object detection and segmentation in surgical video	Jul 22, 2024	Disentanglement, Knowledge Distillation	Code Available	0
ViLLa: Video Reasoning Segmentation with Large Language Model	Jul 18, 2024	Image Segmentation, Language Modeling	Code Available	1
FoodMem: Near Real-time and Precise Food Video Segmentation	Jul 16, 2024	Segmentation, Semantic Segmentation	Unverified	0
VISA: Reasoning Video Object Segmentation via Large Language Models	Jul 16, 2024	Decoder, Object	Code Available	3
Improving Unsupervised Video Object Segmentation via Fake Flow Generation	Jul 16, 2024	Object, object-detection	Unverified	0
Learning Spatial-Semantic Features for Robust Video Object Segmentation	Jul 10, 2024	Object, Semantic Segmentation	Unverified	0
ActionVOS: Actions as Prompts for Video Object Segmentation	Jul 10, 2024	Object, Referring Video Object Segmentation	Code Available	1
Rethinking Image-to-Video Adaptation: An Object-centric Perspective	Jul 9, 2024	Action Recognition, Object	Unverified	0
General and Task-Oriented Video Segmentation	Jul 9, 2024	Disentanglement, Diversity	Code Available	1
Submodular video object proposal selection for semantic object segmentation	Jul 8, 2024	Object, Segmentation	Unverified	0
Non-parametric Contextual Relationship Learning for Semantic Video Object Segmentation	Jul 8, 2024	Semantic Segmentation, Video Object Segmentation	Unverified	0
Context Propagation from Proposals for Semantic Video Object Segmentation	Jul 8, 2024	Object, Segmentation	Unverified	0
DaBiT: Depth and Blur informed Transformer for Joint Refocusing and Super-Resolution	Jul 1, 2024	Deblurring, Super-Resolution	Code Available	0
Deep Unfolding-Aided Parameter Tuning for Plug-and-Play-Based Video Snapshot Compressive Imaging	Jun 28, 2024	Denoising, Video Segmentation	Unverified	0
MissionGNN: Hierarchical Multimodal GNN-based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation	Jun 27, 2024	Anomaly Detection, Graph Generation	Unverified	0
Video Inpainting Localization with Contrastive Learning	Jun 25, 2024	Contrastive Learning, Decoder	Code Available	1
PVUW 2024 Challenge on Complex Video Understanding: Methods and Results	Jun 24, 2024	Segmentation, Semantic Segmentation	Code Available	4
Multimodal Segmentation for Vocal Tract Modeling	Jun 22, 2024	Segmentation, Video Segmentation	Unverified	0
2nd Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation	Jun 20, 2024	Instance Segmentation, Referring Video Object Segmentation	Unverified	0
SALI: Short-term Alignment and Long-term Interaction Network for Colonoscopy Video Polyp Segmentation	Jun 19, 2024	Segmentation, Video Polyp Segmentation	Code Available	1
Trusted Video Inpainting Localization via Deep Attentive Noise Learning	Jun 19, 2024	Semantic Segmentation, Video Inpainting	Code Available	0
ViDSOD-100: A New Dataset and a Baseline Model for RGB-D Video Salient Object Detection	Jun 18, 2024	object-detection, Object Detection	Code Available	0
GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation	Jun 18, 2024	Contrastive Learning, Object	Unverified	0
2nd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation	Jun 12, 2024	Instance Segmentation, Semantic Segmentation	Unverified	0
RMem: Restricted Memory Banks Improve Video Object Segmentation	Jun 12, 2024	Object, Semantic Segmentation	Unverified	0
Visual Representation Learning with Stochastic Frame Prediction	Jun 11, 2024	Decoder, Pose Tracking	Unverified	0
1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation	Jun 11, 2024	Referring Video Object Segmentation, Segmentation	Code Available	1
I-MPN: Inductive Message Passing Network for Efficient Human-in-the-Loop Annotation of Mobile Eye Tracking Data	Jun 10, 2024	Navigate, Object	Unverified	0
Training-Free Robust Interactive Video Object Segmentation	Jun 8, 2024	Interactive Video Object Segmentation, Object	Unverified	0
1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation	Jun 8, 2024	Benchmarking, Instance Segmentation	Unverified	0
1st Place Solution for MOSE Track in CVPR 2024 PVUW Workshop: Complex Video Object Segmentation	Jun 7, 2024	Object, Segmentation	Unverified	0
A Semi-Self-Supervised Approach for Dense-Pattern Video Object Segmentation	Jun 7, 2024	Multi-Task Learning, Object	Unverified	0
3rd Place Solution for MeViS Track in CVPR 2024 PVUW workshop: Motion Expression guided Video Segmentation	Jun 7, 2024	Referring Video Object Segmentation, Semantic Segmentation	Unverified	0
3rd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation	Jun 6, 2024	Object, Position	Unverified	0
Semi-supervised Video Semantic Segmentation Using Unreliable Pseudo Labels for PVUW2024	Jun 2, 2024	Scene Parsing, Scene Understanding	Unverified	0
MCDS-VSS: Moving Camera Dynamic Scene Video Semantic Segmentation by Filtering with Self-Supervised Geometry and Motion	May 30, 2024	Decision Making, Scene Segmentation	Code Available	0
Automatic Dance Video Segmentation for Understanding Choreography	May 30, 2024	Segmentation, Video Segmentation	Unverified	0
Lifelong Learning Using a Dynamically Growing Tree of Sub-networks for Domain Generalization in Video Object Segmentation	May 29, 2024	Domain Generalization, Lifelong learning	Unverified	0
Zero-Shot Video Semantic Segmentation based on Pre-Trained Diffusion Models	May 27, 2024	Segmentation, Semantic correspondence	Code Available	2
One-shot Training for Video Object Segmentation	May 22, 2024	Object, Semantic Segmentation	Unverified	0
Harnessing Vision-Language Pretrained Models with Temporal-Aware Adaptation for Referring Video Object Segmentation	May 17, 2024	Referring Expression Segmentation, Referring Video Object Segmentation	Unverified	0
DeVOS: Flow-Guided Deformable Transformer for Video Object Segmentation	May 11, 2024	Optical Flow Estimation, Semantic Segmentation	Unverified	0
Global Motion Understanding in Large-Scale Video Object Segmentation	May 11, 2024	Instance Segmentation, Optical Flow Estimation	Unverified	0
Space-time Reinforcement Network for Video Object Segmentation	May 7, 2024	Object, Semantic Segmentation	Unverified	0
LVOS: A Benchmark for Large-scale Long-term Video Object Segmentation	Apr 30, 2024	Attribute, Semantic Segmentation	Code Available	2
360VOTS: Visual Object Tracking and Segmentation in Omnidirectional Videos	Apr 22, 2024	Object, Object Tracking	Unverified	0
Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation	Apr 21, 2024	Semantic Segmentation, Video Object Segmentation	Code Available	2

Datasets covered by the benchmark tables, in the order the tables appear: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	TDNet-50 [9]	mIoU	79.9	—	Unverified
3	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DVIS++(VIT-L)	mIoU	63.8	—	Unverified
2	UniVS(Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link(Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA(MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM(MiT-B5)	mIoU	49.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MVNet(DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet(PSPNet)	mIoU	54.36	—	Unverified
3	MVNet(FCN)	mIoU	53.9	—	Unverified
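
Most of the tables above report mIoU (mean intersection over union). As a rough illustration of how such a number is obtained, the following sketch computes mIoU from flattened lists of predicted and ground-truth class IDs. The function name, the `ignore_index` parameter, and the toy labels are assumptions for illustration; actual benchmark protocols usually accumulate intersections and unions over the whole evaluation set and may differ in how they treat ignored or absent classes.

```python
from typing import List, Optional


def mean_iou(pred: List[int], target: List[int], num_classes: int,
             ignore_index: Optional[int] = None) -> float:
    """Mean of per-class IoU over flattened label lists.

    Assumes every predicted ID lies in [0, num_classes).
    Classes absent from both prediction and target are skipped.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # pixels without a valid ground-truth label do not count
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred, target, num_classes=3)
print(f"mIoU: {100 * miou:.1f}")  # leaderboards typically report percentages
```

For a video, one option is to concatenate the flattened labels of all frames before calling the function, so that every frame contributes to the same per-class counts.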