Video Semantic Segmentation

The goal of video semantic segmentation is to assign a predefined class to each pixel in all frames of a video. This requires the model not only to predict accurate segmentation masks but also to ensure that these masks remain temporally consistent across frames. This task has broad applications in areas such as autonomous driving, medical video analysis, and AR/VR.
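
To make the temporal-consistency requirement concrete, here is a minimal sketch that scores how often a pixel keeps the same class between consecutive frames. The function names and the toy video are hypothetical and not taken from any listed paper. The score is deliberately naive: it compares pixels at the same position and ignores motion, whereas published consistency measures typically warp predictions with optical flow before comparing them.

```python
from typing import List

Frame = List[List[int]]  # H x W grid of class ids for one frame


def label_agreement(prev: Frame, curr: Frame) -> float:
    """Fraction of pixel positions whose class id is identical in both frames."""
    total = 0
    same = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += 1
            same += int(a == b)
    return same / total if total else 1.0


def naive_temporal_consistency(video: List[Frame]) -> float:
    """Mean label agreement over all pairs of consecutive frames.

    Assumption: no motion compensation, so a moving object lowers the score
    even when every frame is segmented correctly.
    """
    pairs = list(zip(video, video[1:]))
    if not pairs:
        return 1.0
    return sum(label_agreement(p, c) for p, c in pairs) / len(pairs)


# Toy 2x2 video with three frames; one pixel flips class in the last frame.
video = [
    [[0, 0], [1, 1]],
    [[0, 0], [1, 1]],
    [[0, 1], [1, 1]],
]
score = naive_temporal_consistency(video)
print(score)
```

A static scene with a flickering prediction scores below 1.0 here, which is the failure mode that temporal consistency is meant to penalize.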

Papers

Showing 101–125 of 895 papers

Title	Date	Tasks	Status	Hype
Track Anything Behind Everything: Zero-Shot Amodal Video Object Segmentation	Nov 28, 2024	3D Reconstruction, Segmentation	Unverified	0
RoMo: Robust Motion Segmentation Improves Structure from Motion	Nov 27, 2024	Camera Calibration, Motion Segmentation	Unverified	0
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation	Nov 26, 2024	Natural Language Understanding, Referring Video Object Segmentation	Code Available	3
Geometric Algebra Planes: Convex Implicit Neural Volumes	Nov 20, 2024	Decoder, Video Segmentation	Unverified	0
ClickTrack: Towards Real-time Interactive Single Object Tracking	Nov 20, 2024	Object, Object Tracking	Unverified	0
IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos	Nov 18, 2024	Pose Estimation, Semantic Segmentation	Code Available	2
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level	Nov 15, 2024	Benchmarking, counterfactual	Unverified	0
Zero-shot capability of SAM-family models for bone segmentation in CT scans	Nov 13, 2024	Image Segmentation, Medical Image Segmentation	Unverified	0
GaussianCut: Interactive segmentation via graph cut for 3D Gaussian Splatting	Nov 12, 2024	3DGS, graph construction	Unverified	0
MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data	Nov 12, 2024	Segmentation, Uncertainty Quantification	Code Available	0
Breaking The Ice: Video Segmentation for Close-Range Ice-Covered Waters	Nov 7, 2024	Image Segmentation, Optical Flow Estimation	Unverified	0
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos	Nov 7, 2024	Decoder, Language Modeling	Unverified	0
LiVOS: Light Video Object Segmentation with Gated Linear Matching	Nov 5, 2024	GPU, Semantic Segmentation	Code Available	1
Event-guided Low-light Video Semantic Segmentation	Nov 1, 2024	Decoder, Semantic Segmentation	Unverified	0
Continuous Spatio-Temporal Memory Networks for 4D Cardiac Cine MRI Segmentation	Oct 30, 2024	Anatomy, MRI segmentation	Code Available	0
Addressing Issues with Working Memory in Video Object Segmentation	Oct 29, 2024	Inductive Bias, Object	Unverified	0
SMITE: Segment Me In TimE	Oct 24, 2024	Segmentation, Semantic Segmentation	Code Available	3
VideoSAM: A Large Vision Foundation Model for High-Speed Video Segmentation	Oct 22, 2024	Segmentation, Video Segmentation	Code Available	0
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree	Oct 21, 2024	Heuristic Search, Object	Code Available	4
Temporal-Enhanced Multimodal Transformer for Referring Multi-Object Tracking and Segmentation	Oct 17, 2024	Multi-Object Tracking, Multi-Object Tracking and Segmentation	Unverified	0
Configurable Embodied Data Generation for Class-Agnostic RGB-D Video Segmentation	Oct 16, 2024	Benchmarking, Panoptic Segmentation	Unverified	0
VideoSAM: Open-World Video Segmentation	Oct 11, 2024	Autonomous Driving, Decoder	Unverified	0
Shift and matching queries for video semantic segmentation	Oct 10, 2024	Image Segmentation, Segmentation	Unverified	0
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos	Sep 29, 2024	All, Image Segmentation	Code Available	2
X-Prompt: Multi-modal Visual Prompt for Video Object Segmentation	Sep 28, 2024	Semantic Segmentation, Video Object Segmentation	Code Available	1

Datasets: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

Cityscapes val
#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	TDNet-50 [9]	mIoU	79.9	—	Unverified
3	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

CamVid
#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

VSPW
#	Model	Metric	Claimed	Verified	Status
1	DVIS++(VIT-L)	mIoU	63.8	—	Unverified
2	UniVS(Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link(Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA(MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM(MiT-B5)	mIoU	49.3	—	Unverified

LaRS
#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

Multispectral Video Semantic Segmentation
#	Model	Metric	Claimed	Verified	Status
1	MVNet(DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet(PSPNet)	mIoU	54.36	—	Unverified
3	MVNet(FCN)	mIoU	53.9	—	Unverified
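
Most of the tables above report mIoU (also written Mean IoU). As a rough illustration of what that number measures, the following sketch computes per-class intersection over union from flattened label lists and averages it over the classes that occur. The function name and toy labels are hypothetical. Each benchmark's official protocol (ignored labels, accumulation over the whole dataset, class subsets) may differ, and the Q metric in the LaRS table is a different measure that is not covered here.

```python
from typing import List


def mean_iou(pred: List[int], target: List[int], num_classes: int) -> float:
    """Mean of per-class IoU over classes present in pred or target.

    pred and target are flattened per-pixel class ids of equal length,
    e.g. all pixels of all frames of a video concatenated.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            inter[p] += 1
            union[p] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with 3 classes and 6 pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred, target, num_classes=3)
print(round(100 * miou, 1))  # leaderboards usually quote mIoU as a percentage
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5, which would be quoted as 50.0 in the percentage convention used by the tables.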