Video Semantic Segmentation

The goal of video semantic segmentation is to assign a predefined class to each pixel in all frames of a video. This requires the model not only to predict accurate segmentation masks but also to ensure that these masks remain temporally consistent across frames. This task has broad applications in areas such as autonomous driving, medical video analysis, and AR/VR.
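The two requirements above, per-pixel accuracy and temporal consistency, can be illustrated with a minimal pure-Python sketch. The function names `mean_iou` and `label_stability` are hypothetical, chosen for this example only. The mIoU here is accumulated over all pixels of all frames and skips classes that never appear; individual benchmarks may handle ignored labels or averaging differently. The stability score is only a crude proxy that assumes a static camera: it does not compensate for motion, whereas consistency measures in the literature typically warp predictions with optical flow first.

```python
from typing import List, Sequence

Frame = List[List[int]]  # one class id per pixel, stored row by row


def mean_iou(preds: Sequence[Frame], targets: Sequence[Frame], num_classes: int) -> float:
    """Mean intersection-over-union, accumulated over every pixel of every frame."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for pred, target in zip(preds, targets):
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                if p == t:
                    # Correct pixel: counts once in both intersection and union.
                    inter[p] += 1
                    union[p] += 1
                else:
                    # Wrong pixel: enlarges the union of both classes involved.
                    union[p] += 1
                    union[t] += 1
    # Classes absent from both prediction and ground truth are skipped.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


def label_stability(preds: Sequence[Frame]) -> float:
    """Fraction of pixels whose predicted label is unchanged between consecutive frames.

    Assumption: no motion compensation, so real scene motion also lowers the score.
    """
    same = 0
    total = 0
    for prev, curr in zip(preds, preds[1:]):
        for prev_row, curr_row in zip(prev, curr):
            for a, b in zip(prev_row, curr_row):
                same += int(a == b)
                total += 1
    return same / total if total else 1.0


# Toy two-frame video with 2x2 frames and two classes (made-up data).
targets = [[[0, 0], [1, 1]], [[0, 0], [1, 1]]]
preds = [[[0, 0], [1, 1]], [[0, 1], [1, 1]]]

print(mean_iou(preds, targets, num_classes=2))
print(label_stability(preds))
```

In the toy data the second frame mislabels one pixel, which lowers both scores: the error reduces the IoU of both classes, and the label flip between frames reduces stability.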

Papers


Showing 301–350 of 895 papers

Title	Date	Tasks	Status
PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild	Apr 15, 2025	Segmentation, Semantic Segmentation	Unverified
MASSeg : 2nd Technical Report for 4th PVUW MOSE Track	Apr 14, 2025	Data Augmentation, Object	Code Available
FVOS for MOSE Track of 4th PVUW Challenge: 3rd Place Solution	Apr 13, 2025	Segmentation, Semantic Segmentation	Unverified
Multi-person Physics-based Pose Estimation for Combat Sports	Apr 11, 2025	3D Human Pose Estimation, 3D Multi-Person Pose Estimation	Unverified
STSeg-Complex Video Object Segmentation: The 1st Solution for 4th PVUW MOSE Challenge	Apr 11, 2025	Semantic Segmentation, Video Object Segmentation	Unverified
Saliency-Motion Guided Trunk-Collateral Network for Unsupervised Video Object Segmentation	Apr 8, 2025	Optical Flow Estimation, Salient Object Detection	Unverified
Zero-Shot 4D Lidar Panoptic Segmentation	Apr 1, 2025	Diversity, Panoptic Segmentation	Unverified
CamoSAM2: Motion-Appearance Induced Auto-Refining Prompts for Video Camouflaged Object Detection	Apr 1, 2025	Camouflaged Object Segmentation, object-detection	Unverified
ReferDINO-Plus: 2nd Solution for 4th PVUW MeViS Challenge at CVPR 2025	Mar 30, 2025	Object, Referring Video Object Segmentation	Code Available
Online Reasoning Video Segmentation with Just-in-Time Digital Twins	Mar 27, 2025	Reasoning Segmentation, Video Segmentation	Unverified
Comparative Analysis of Image, Video, and Audio Classifiers for Automated News Video Segmentation	Mar 27, 2025	Binary Classification, Video Segmentation	Unverified
One-Shot Medical Video Object Segmentation via Temporal Contrastive Memory Networks	Mar 19, 2025	Decoder, Segmentation	Code Available
Reducing Annotation Burden: Exploiting Image Knowledge for Few-Shot Medical Video Object Segmentation via Spatiotemporal Consistency Relearning	Mar 19, 2025	Segmentation, Semantic Segmentation	Code Available
Leveraging Vision-Language Models for Open-Vocabulary Instance Segmentation and Tracking	Mar 18, 2025	Descriptive, Instance Segmentation	Code Available
AUTV: Creating Underwater Video Datasets with Pixel-wise Annotations	Mar 17, 2025	Semantic Segmentation, Video Generation	Unverified
SAM2 for Image and Video Segmentation: A Comprehensive Survey	Mar 17, 2025	Autonomous Driving, Image Segmentation	Unverified
Leveraging Motion Information for Better Self-Supervised Video Correspondence Learning	Mar 15, 2025	Object, Semantic Segmentation	Unverified
Investigation of Frame Differences as Motion Cues for Video Object Segmentation	Mar 12, 2025	Optical Flow Estimation, Segmentation	Unverified
Open-World Skill Discovery from Unsegmented Demonstrations	Mar 11, 2025	Boundary Detection, Event Segmentation	Unverified
OmniSAM: Omnidirectional Segment Anything Model for UDA in Panoramic Semantic Segmentation	Mar 10, 2025	Pseudo Label, Semantic Segmentation	Unverified
Rethinking Few-Shot Medical Image Segmentation by SAM2: A Training-Free Framework with Augmentative Prompting and Dynamic Matching	Mar 5, 2025	Data Augmentation, Few-Shot Learning	Unverified
Parameter-free Video Segmentation for Vision and Language Understanding	Mar 3, 2025	Question Answering, Video Question Answering	Unverified
An Analysis of Data Transformation Effects on Segment Anything 2	Feb 25, 2025	Semantic Segmentation, Video Object Segmentation	Unverified
Deep learning approaches to surgical video segmentation and object detection: A Scoping Review	Feb 23, 2025	object-detection, Object Detection	Unverified
Pointmap Association and Piecewise-Plane Constraint for Consistent and Compact 3D Gaussian Segmentation Field	Feb 22, 2025	2D Panoptic Segmentation, 3D Scene Reconstruction	Unverified
Role of the Pretraining and the Adaptation data sizes for low-resource real-time MRI video segmentation	Feb 20, 2025	Video Segmentation, Video Semantic Segmentation	Unverified
Wandering around: A bioinspired approach to visual attention through object motion sensitivity	Feb 10, 2025	Low-latency processing, Motion Segmentation	Code Available
HD-EPIC: A Highly-Detailed Egocentric Video Dataset	Feb 6, 2025	Action Recognition, Nutrition	Unverified
Efficient Portrait Matte Creation With Layer Diffusion and Connectivity Priors	Jan 27, 2025	Image Matting, Video Segmentation	Unverified
ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations	Jan 24, 2025	Decoder, Object	Unverified
Efficient Frame Extraction: A Novel Approach Through Frame Similarity and Surgical Tool Tracking for Video Segmentation	Jan 19, 2025	Video Segmentation, Video Semantic Segmentation	Code Available
Static Segmentation by Tracking: A Frustratingly Label-Efficient Approach to Fine-Grained Segmentation	Jan 12, 2025	Image Retrieval, Image Segmentation	Unverified
Multi-Context Temporal Consistent Modeling for Referring Video Object Segmentation	Jan 9, 2025	Referring Video Object Segmentation, Semantic Segmentation	Code Available
Segment Anything Model for Zero-shot Single Particle Tracking in Liquid Phase Transmission Electron Microscopy	Jan 6, 2025	Video Segmentation, Video Semantic Segmentation	Code Available
VideoGLaMM : A Large Multimodal Model for Pixel-Level Visual Grounding in Videos	Jan 1, 2025	Large Language Model, Video Segmentation	Unverified
EntitySAM: Segment Everything in Video	Jan 1, 2025	Decoder, Object	Unverified
DTOS: Dynamic Time Object Sensing with Large Multimodal Model	Jan 1, 2025	Moment Retrieval, Referring Video Object Segmentation	Code Available
Decoupled Motion Expression Video Segmentation	Jan 1, 2025	Instance Segmentation, Referring Video Object Segmentation	Unverified
VidSeg: Training-free Video Semantic Segmentation based on Diffusion Models	Jan 1, 2025	Segmentation, Semantic Segmentation	Unverified
Semantic and Sequential Alignment for Referring Video Object Segmentation	Jan 1, 2025	Instance Segmentation, Referring Video Object Segmentation	Unverified
Is Segment Anything Model 2 All You Need for Surgery Video Segmentation? A Systematic Evaluation	Dec 31, 2024	All, Segmentation	Unverified
Generative Video Propagation	Dec 27, 2024	Image to Video Generation, Video Generation	Unverified
When SAM2 Meets Video Shadow and Mirror Detection	Dec 26, 2024	Image Segmentation, Mirror Detection	Code Available
Collaborative Hybrid Propagator for Temporal Misalignment in Audio-Visual Segmentation	Dec 11, 2024	Video Segmentation, Video Semantic Segmentation	Unverified
Static-Dynamic Class-level Perception Consistency in Video Semantic Segmentation	Dec 11, 2024	Autonomous Driving, Contrastive Learning	Unverified
Stable Mean Teacher for Semi-supervised Video Action Detection	Dec 10, 2024	Action Detection, Semantic Segmentation	Code Available
Video Decomposition Prior: A Methodology to Decompose Videos into Layers	Dec 6, 2024	Semantic Segmentation, Video Editing	Unverified
Track Anything Behind Everything: Zero-Shot Amodal Video Object Segmentation	Nov 28, 2024	3D Reconstruction, Segmentation	Unverified
RoMo: Robust Motion Segmentation Improves Structure from Motion	Nov 27, 2024	Camera Calibration, Motion Segmentation	Unverified
ClickTrack: Towards Real-time Interactive Single Object Tracking	Nov 20, 2024	Object, Object Tracking	Unverified



Datasets: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

Cityscapes val

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	TDNet-50 [9]	mIoU	79.9	—	Unverified
3	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

CamVid

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

VSPW

#	Model	Metric	Claimed	Verified	Status
1	DVIS++(VIT-L)	mIoU	63.8	—	Unverified
2	UniVS(Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link(Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA(MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM(MiT-B5)	mIoU	49.3	—	Unverified

LaRS

#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

Multispectral Video Semantic Segmentation

#	Model	Metric	Claimed	Verified	Status
1	MVNet(DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet(PSPNet)	mIoU	54.36	—	Unverified
3	MVNet(FCN)	mIoU	53.9	—	Unverified