Video Semantic Segmentation

The goal of video semantic segmentation is to assign a predefined class to each pixel in all frames of a video. This requires the model not only to predict accurate segmentation masks but also to ensure that these masks remain temporally consistent across frames. This task has broad applications in areas such as autonomous driving, medical video analysis, and AR/VR.
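As a rough illustration of the task described above, the sketch below labels each pixel of a frame with its highest-scoring class and then measures how many pixels keep their label from one frame to the next. The function names `label_frame` and `label_agreement` are hypothetical, and the agreement score is only a crude stand-in for temporal consistency: it ignores motion, so it penalizes legitimate label changes caused by moving objects.

```python
from typing import List

Frame = List[List[int]]  # one class index per pixel, indexed [row][col]


def label_frame(scores: List[List[List[float]]]) -> Frame:
    """Pick the highest-scoring class for every pixel of one frame.

    scores[r][c] holds one score per class for the pixel at row r, column c.
    Ties go to the lowest class index.
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


def label_agreement(prev: Frame, curr: Frame) -> float:
    """Fraction of pixels whose label is identical in two same-sized frames.

    Assumption: a simple proxy for temporal consistency, not a standard metric.
    """
    total = sum(len(row) for row in prev)
    same = sum(
        p == c
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
    )
    return same / total


# Two consecutive 1x2 frames with three classes (made-up scores).
scores_t0 = [[[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]]]
scores_t1 = [[[0.2, 0.6, 0.2], [0.3, 0.3, 0.4]]]

labels_t0 = label_frame(scores_t0)  # [[1, 0]]
labels_t1 = label_frame(scores_t1)  # [[1, 2]]
agreement = label_agreement(labels_t0, labels_t1)  # 0.5
```

In practice the per-pixel scores would come from a segmentation network, and consistency is usually assessed after compensating for motion between frames.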

Papers

Showing 126–150 of 895 papers

Title	Date	Tasks	Status	Hype
Underwater Camouflaged Object Tracking Meets Vision-Language SAM2	Sep 25, 2024	Object, Object Tracking	Code Available	5
Memory Matching is not Enough: Jointly Improving Memory Matching and Decoding for Video Object Segmentation	Sep 22, 2024	Semantic Segmentation, Semi-Supervised Video Object Segmentation	Unverified	0
Learning Keypoints for Multi-Agent Behavior Analysis using Self-Supervision	Sep 14, 2024	Video Segmentation, Video Semantic Segmentation	Unverified	0
Self-Prompting Polyp Segmentation in Colonoscopy using Hybrid Yolo-SAM 2 Model	Sep 14, 2024	Medical Image Segmentation, Polyp Segmentation	Code Available	2
LSVOS Challenge Report: Large-scale Complex and Long Video Object Segmentation	Sep 9, 2024	Object, Referring Video Object Segmentation	Unverified	0
Discriminative Spatial-Semantic VOS Solution: 1st Place Solution for 6th LSVOS	Aug 29, 2024	Object, Object Recognition	Code Available	0
Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation	Aug 28, 2024	Object, Semantic Segmentation	Code Available	2
CSS-Segment: 2nd Place Report of LSVOS Challenge VOS Track	Aug 24, 2024	Autonomous Driving, Object	Unverified	0
Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey	Aug 23, 2024	Image Segmentation, Segmentation	Code Available	5
The 2nd Solution for LSVOS Challenge RVOS Track: Spatial-temporal Refinement for Consistent Semantic Segmentation	Aug 22, 2024	Referring Video Object Segmentation, Segmentation	Unverified	0
The Instance-centric Transformer for the RVOS Track of LSVOS Challenge: 3rd Place Solution	Aug 20, 2024	Referring Video Object Segmentation, Retrieval	Unverified	0
Rethinking Video Segmentation with Masked Video Consistency: Did the Model Learn as Intended?	Aug 20, 2024	Image Segmentation, Segmentation	Unverified	0
LSVOS Challenge 3rd Place Report: SAM2 and Cutie based VOS	Aug 20, 2024	Instance Segmentation, Object	Unverified	0
Video Object Segmentation via SAM 2: The 4th Solution for LSVOS Challenge VOS Track	Aug 19, 2024	Object, Segmentation	Unverified	0
3D-Aware Instance Segmentation and Tracking in Egocentric Videos	Aug 19, 2024	3D Object Reconstruction, Instance Segmentation	Unverified	0
UNINEXT-Cutie: The 1st Solution for LSVOS Challenge RVOS Track	Aug 19, 2024	Referring Video Object Segmentation, Semantic Segmentation	Unverified	0
Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning	Aug 15, 2024	Segmentation, Video Segmentation	Code Available	2
Novel adaptation of video segmentation to 3D MRI: efficient zero-shot knee segmentation with SAM2	Aug 8, 2024	Image Segmentation, Medical Image Analysis	Unverified	0
SAM 2 in Robotic Surgery: An Empirical Evaluation for Robustness and Generalization in Surgical Video Segmentation	Aug 8, 2024	Decoder, Interactive Segmentation	Unverified	0
Saliency Detection in Educational Videos: Analyzing the Performance of Current Models, Identifying Limitations and Advancement Directions	Aug 8, 2024	Information Retrieval, Saliency Detection	Unverified	0
Is SAM 2 Better than SAM in Medical Image Segmentation?	Aug 8, 2024	Image Segmentation, Medical Image Segmentation	Unverified	0
Performance and Non-adversarial Robustness of the Segment Anything Model 2 in Surgical Video Segmentation	Aug 7, 2024	Adversarial Robustness, Image Segmentation	Unverified	0
Fast Sprite Decomposition from Animated Graphics	Aug 7, 2024	Semantic Segmentation, Video Object Segmentation	Unverified	0
Segment Anything in Medical Images and Videos: Benchmark and Deployment	Aug 6, 2024	Benchmarking, Segmentation	Code Available	7
Biomedical SAM 2: Segment Anything in Biomedical Images and Videos	Aug 6, 2024	Image Segmentation, Medical Image Segmentation	Code Available	0

Datasets, in the order of the result tables below: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
3	TDNet-50 [9]	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DVIS++ (ViT-L)	mIoU	63.8	—	Unverified
2	UniVS (Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link (Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA (MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM (MiT-B5)	mIoU	49.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MVNet (DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet (PSPNet)	mIoU	54.36	—	Unverified
3	MVNet (FCN)	mIoU	53.9	—	Unverified
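
Most of the tables above report mIoU (mean intersection over union). For readers unfamiliar with the metric, here is a minimal sketch of how it can be computed from flattened label arrays. The function name `mean_iou` is hypothetical, and the choice to skip classes that appear in neither the prediction nor the ground truth is an assumption. The evaluation protocols behind the listed results are not stated here; benchmarks commonly accumulate counts over a whole dataset and may exclude an "ignore" label.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union across classes, as a value in [0, 1].

    pred and target are flattened per-pixel class indices of equal length.
    Classes absent from both pred and target are skipped (an assumption).
    """
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for k in range(num_classes):
        union = pred_count[k] + target_count[k] - intersection[k]
        if union > 0:
            ious.append(intersection[k] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Four pixels, two classes (made-up labels).
# Class 0: intersection 1, union 2. Class 1: intersection 2, union 3.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
score = mean_iou(pred, target, num_classes=2)  # (1/2 + 2/3) / 2
```

Leaderboards usually present the score as a percentage, so a result like this one would appear multiplied by 100.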