Video Semantic Segmentation

Video semantic segmentation assigns one class from a predefined set to every pixel in every frame of a video. A model must therefore predict accurate segmentation masks and also keep those masks temporally consistent across frames. Applications include autonomous driving, medical video analysis, and AR/VR.
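To make the two requirements concrete, here is a minimal sketch in Python with NumPy. It assumes a model outputs per-pixel class scores of shape `(T, C, H, W)` (frames, classes, height, width); that layout and the function names `predict_labels` and `naive_temporal_consistency` are illustrative choices, not part of any listed method. The consistency score is deliberately naive: it compares consecutive frames pixel by pixel without motion compensation, whereas published consistency metrics typically warp predictions with optical flow first.

```python
import numpy as np


def predict_labels(logits: np.ndarray) -> np.ndarray:
    """Turn per-pixel class scores (T, C, H, W) into label maps (T, H, W)."""
    return logits.argmax(axis=1)


def naive_temporal_consistency(labels: np.ndarray) -> float:
    """Fraction of pixels whose label is unchanged between consecutive frames.

    Simplifying assumption: no motion compensation, so real scene motion
    is counted as inconsistency too.
    """
    if labels.shape[0] < 2:
        return 1.0  # a single frame is trivially consistent
    same = labels[1:] == labels[:-1]
    return float(same.mean())


# Toy video: 3 frames, 2 classes, 2x2 pixels.
logits = np.zeros((3, 2, 2, 2))
logits[:, 0] = 1.0          # class 0 wins everywhere by default
logits[1, 1, 0, 0] = 2.0    # in frame 1, pixel (0, 0) flickers to class 1

labels = predict_labels(logits)
score = naive_temporal_consistency(labels)
print(labels.shape, score)  # (3, 2, 2) 0.75
```

The single flickering pixel breaks agreement in both frame pairs it touches (6 of 8 pixel comparisons match), which is why the score is 0.75 even though only one prediction is wrong.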

Papers



Title	Date	Tasks	Status
Geometric Algebra Planes: Convex Implicit Neural Volumes	Nov 20, 2024	Decoder, Video Segmentation	Unverified
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level	Nov 15, 2024	Benchmarking, counterfactual	Unverified
Zero-shot capability of SAM-family models for bone segmentation in CT scans	Nov 13, 2024	Image Segmentation, Medical Image Segmentation	Unverified
MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data	Nov 12, 2024	Segmentation, Uncertainty Quantification	Code Available
GaussianCut: Interactive segmentation via graph cut for 3D Gaussian Splatting	Nov 12, 2024	3DGS, graph construction	Unverified
Breaking The Ice: Video Segmentation for Close-Range Ice-Covered Waters	Nov 7, 2024	Image Segmentation, Optical Flow Estimation	Unverified
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos	Nov 7, 2024	Decoder, Language Modeling	Unverified
Event-guided Low-light Video Semantic Segmentation	Nov 1, 2024	Decoder, Semantic Segmentation	Unverified
Continuous Spatio-Temporal Memory Networks for 4D Cardiac Cine MRI Segmentation	Oct 30, 2024	Anatomy, MRI segmentation	Code Available
Addressing Issues with Working Memory in Video Object Segmentation	Oct 29, 2024	Inductive Bias, Object	Unverified
VideoSAM: A Large Vision Foundation Model for High-Speed Video Segmentation	Oct 22, 2024	Segmentation, Video Segmentation	Code Available
Temporal-Enhanced Multimodal Transformer for Referring Multi-Object Tracking and Segmentation	Oct 17, 2024	Multi-Object Tracking, Multi-Object Tracking and Segmentation	Unverified
Configurable Embodied Data Generation for Class-Agnostic RGB-D Video Segmentation	Oct 16, 2024	Benchmarking, Panoptic Segmentation	Unverified
VideoSAM: Open-World Video Segmentation	Oct 11, 2024	Autonomous Driving, Decoder	Unverified
Shift and matching queries for video semantic segmentation	Oct 10, 2024	Image Segmentation, Segmentation	Unverified
Memory Matching is not Enough: Jointly Improving Memory Matching and Decoding for Video Object Segmentation	Sep 22, 2024	Semantic Segmentation, Semi-Supervised Video Object Segmentation	Unverified
Learning Keypoints for Multi-Agent Behavior Analysis using Self-Supervision	Sep 14, 2024	Video Segmentation, Video Semantic Segmentation	Unverified
LSVOS Challenge Report: Large-scale Complex and Long Video Object Segmentation	Sep 9, 2024	Object, Referring Video Object Segmentation	Unverified
Discriminative Spatial-Semantic VOS Solution: 1st Place Solution for 6th LSVOS	Aug 29, 2024	Object, Object Recognition	Code Available
CSS-Segment: 2nd Place Report of LSVOS Challenge VOS Track	Aug 24, 2024	Autonomous Driving, Object	Unverified
The 2nd Solution for LSVOS Challenge RVOS Track: Spatial-temporal Refinement for Consistent Semantic Segmentation	Aug 22, 2024	Referring Video Object Segmentation, Segmentation	Unverified
The Instance-centric Transformer for the RVOS Track of LSVOS Challenge: 3rd Place Solution	Aug 20, 2024	Referring Video Object Segmentation, Retrieval	Unverified
Rethinking Video Segmentation with Masked Video Consistency: Did the Model Learn as Intended?	Aug 20, 2024	Image Segmentation, Segmentation	Unverified
LSVOS Challenge 3rd Place Report: SAM2 and Cutie based VOS	Aug 20, 2024	Instance Segmentation, Object	Unverified
Video Object Segmentation via SAM 2: The 4th Solution for LSVOS Challenge VOS Track	Aug 19, 2024	Object, Segmentation	Unverified
UNINEXT-Cutie: The 1st Solution for LSVOS Challenge RVOS Track	Aug 19, 2024	Referring Video Object Segmentation, Semantic Segmentation	Unverified
3D-Aware Instance Segmentation and Tracking in Egocentric Videos	Aug 19, 2024	3D Object Reconstruction, Instance Segmentation	Unverified
SAM 2 in Robotic Surgery: An Empirical Evaluation for Robustness and Generalization in Surgical Video Segmentation	Aug 8, 2024	Decoder, Interactive Segmentation	Unverified
Is SAM 2 Better than SAM in Medical Image Segmentation?	Aug 8, 2024	Image Segmentation, Medical Image Segmentation	Unverified
Saliency Detection in Educational Videos: Analyzing the Performance of Current Models, Identifying Limitations and Advancement Directions	Aug 8, 2024	Information Retrieval, Saliency Detection	Unverified
Novel adaptation of video segmentation to 3D MRI: efficient zero-shot knee segmentation with SAM2	Aug 8, 2024	Image Segmentation, Medical Image Analysis	Unverified
Fast Sprite Decomposition from Animated Graphics	Aug 7, 2024	Semantic Segmentation, Video Object Segmentation	Unverified
Performance and Non-adversarial Robustness of the Segment Anything Model 2 in Surgical Video Segmentation	Aug 7, 2024	Adversarial Robustness, Image Segmentation	Unverified
Biomedical SAM 2: Segment Anything in Biomedical Images and Videos	Aug 6, 2024	Image Segmentation, Medical Image Segmentation	Code Available
Strike the Balance: On-the-Fly Uncertainty based User Interactions for Long-Term Video Object Segmentation	Jul 31, 2024	Object, Segmentation	Code Available
Disentangling spatio-temporal knowledge for weakly supervised object detection and segmentation in surgical video	Jul 22, 2024	Disentanglement, Knowledge Distillation	Code Available
Improving Unsupervised Video Object Segmentation via Fake Flow Generation	Jul 16, 2024	Object, object-detection	Unverified
FoodMem: Near Real-time and Precise Food Video Segmentation	Jul 16, 2024	Segmentation, Semantic Segmentation	Unverified
Learning Spatial-Semantic Features for Robust Video Object Segmentation	Jul 10, 2024	Object, Semantic Segmentation	Unverified
Rethinking Image-to-Video Adaptation: An Object-centric Perspective	Jul 9, 2024	Action Recognition, Object	Unverified
Non-parametric Contextual Relationship Learning for Semantic Video Object Segmentation	Jul 8, 2024	Semantic Segmentation, Video Object Segmentation	Unverified
Submodular video object proposal selection for semantic object segmentation	Jul 8, 2024	Object, Segmentation	Unverified
Context Propagation from Proposals for Semantic Video Object Segmentation	Jul 8, 2024	Object, Segmentation	Unverified
DaBiT: Depth and Blur informed Transformer for Joint Refocusing and Super-Resolution	Jul 1, 2024	Deblurring, Super-Resolution	Code Available
Deep Unfolding-Aided Parameter Tuning for Plug-and-Play-Based Video Snapshot Compressive Imaging	Jun 28, 2024	Denoising, Video Segmentation	Unverified
MissionGNN: Hierarchical Multimodal GNN-based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation	Jun 27, 2024	Anomaly Detection, Graph Generation	Unverified
Multimodal Segmentation for Vocal Tract Modeling	Jun 22, 2024	Segmentation, Video Segmentation	Unverified
2nd Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation	Jun 20, 2024	Instance Segmentation, Referring Video Object Segmentation	Unverified
Trusted Video Inpainting Localization via Deep Attentive Noise Learning	Jun 19, 2024	Semantic Segmentation, Video Inpainting	Code Available
GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation	Jun 18, 2024	Contrastive Learning, Object	Unverified


Datasets: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

Cityscapes val

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	TDNet-50 [9]	mIoU	79.9	—	Unverified
3	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

CamVid

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

VSPW

#	Model	Metric	Claimed	Verified	Status
1	DVIS++(VIT-L)	mIoU	63.8	—	Unverified
2	UniVS(Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link(Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA(MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM(MiT-B5)	mIoU	49.3	—	Unverified

LaRS

#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

Multispectral Video Semantic Segmentation

#	Model	Metric	Claimed	Verified	Status
1	MVNet(DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet(PSPNet)	mIoU	54.36	—	Unverified
3	MVNet(FCN)	mIoU	53.9	—	Unverified
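
Most of the tables above report mIoU (mean intersection over union, also written "Mean IoU"). As a rough guide to how that number is usually obtained, the sketch below accumulates a confusion matrix over predicted and ground-truth label maps and averages per-class IoU. The function names, the `ignore_index=255` convention for unlabeled pixels, and the choice to skip classes that never appear are assumptions for illustration; each benchmark's official evaluation script may differ in those details.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count (target, pred) label pairs; rows are ground truth, columns are predictions."""
    valid = target != ignore_index  # assumption: 255 marks unlabeled pixels
    idx = target[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Average IoU over classes that occur in the prediction or the ground truth."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0  # skip classes absent from both (assumed convention)
    return float((tp[present] / union[present]).mean())


# Toy example: one 2x2 frame, 2 classes, one mislabeled pixel.
target = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])

conf = confusion_matrix(pred, target, num_classes=2)
miou = mean_iou(conf)
print(round(100 * miou, 1))  # 58.3
```

For a video dataset, the confusion matrix would typically be summed over all evaluated frames before `mean_iou` is applied, and the result is reported as a percentage, as in the tables.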