Video Semantic Segmentation

The goal of video semantic segmentation is to assign a predefined class to each pixel in all frames of a video. This requires the model not only to predict accurate segmentation masks but also to ensure that these masks remain temporally consistent across frames. This task has broad applications in areas such as autonomous driving, medical video analysis, and AR/VR.
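The two requirements above, per-pixel accuracy and temporal consistency, can each be given a rough number. The sketch below is illustrative only: `video_miou` accumulates intersection and union counts over all frames before averaging across classes, and `label_stability` is a naive, hypothetical proxy for temporal consistency that counts pixels whose predicted label is unchanged between consecutive frames. It does not compensate for motion, whereas published consistency measures typically warp predictions with optical flow first. The function names and the toy data are assumptions, and individual benchmarks may average IoU differently.

```python
from typing import Sequence

Frame = Sequence[Sequence[int]]  # H x W grid of integer class ids


def video_miou(pred: Sequence[Frame], target: Sequence[Frame], num_classes: int) -> float:
    """Mean IoU over classes, with pixel counts accumulated across all frames."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p_frame, t_frame in zip(pred, target):
        for p_row, t_row in zip(p_frame, t_frame):
            for p, t in zip(p_row, t_row):
                if p == t:
                    # Pixel lies in both the predicted and the true region of class p.
                    inter[p] += 1
                    union[p] += 1
                else:
                    # Pixel enlarges the union of both classes involved.
                    union[p] += 1
                    union[t] += 1
    # Classes that never occur in prediction or ground truth are skipped.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


def label_stability(pred: Sequence[Frame]) -> float:
    """Fraction of pixels keeping the same predicted label in consecutive frames.

    Naive proxy (assumption): no optical-flow warping, so real motion lowers the score.
    """
    same = 0
    total = 0
    for prev, curr in zip(pred, pred[1:]):
        for row_prev, row_curr in zip(prev, curr):
            for a, b in zip(row_prev, row_curr):
                same += int(a == b)
                total += 1
    return same / total if total else 1.0


if __name__ == "__main__":
    # Toy two-frame video with 2x2 frames and two classes.
    pred = [[[0, 1], [1, 1]], [[0, 1], [0, 1]]]
    target = [[[0, 1], [1, 1]], [[0, 1], [1, 1]]]
    print(video_miou(pred, target, num_classes=2))
    print(label_stability(pred))
```

A high mIoU with low label stability would indicate predictions that are accurate on average but flicker between frames, which is the failure mode the task description warns about.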

Papers

Showing 201–250 of 895 papers

Title	Date	Tasks	Status	Hype
Moving Object Segmentation: All You Need Is SAM (and Flow)	Apr 18, 2024	All, Motion Segmentation	Code Available	3
arcjetCV: an open-source software to analyze material ablation	Apr 17, 2024	Video Segmentation, Video Semantic Segmentation	Code Available	0
Spatial-Temporal Multi-level Association for Video Object Segmentation	Apr 9, 2024	Object, Segmentation	Unverified	0
Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation	Apr 4, 2024	Contrastive Learning, Referring Expression	Code Available	2
Event-assisted Low-Light Video Object Segmentation	Apr 2, 2024	Object, Semantic Segmentation	Code Available	1
DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries	Mar 29, 2024	Object, Video Instance Segmentation	Code Available	2
Temporally Consistent Referring Video Object Segmentation with Hybrid Memory	Mar 28, 2024	HTR, Object	Code Available	1
Annolid: Annotate, Segment, and Track Anything You Need	Mar 27, 2024	Instance Segmentation, Segmentation	Code Available	0
Efficient Video Object Segmentation via Modulated Cross-Attention Memory	Mar 26, 2024	GPU, Object	Code Available	2
Triple Component Matrix Factorization: Untangling Global, Local, and Noisy Components	Mar 21, 2024	Anomaly Detection, Video Segmentation	Unverified	0
PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model	Mar 21, 2024	Decoder, Generalized Referring Expression Segmentation	Code Available	3
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation	Mar 18, 2024	Referring Video Object Segmentation, Semantic Segmentation	Code Available	1
Video Object Segmentation with Dynamic Query Modulation	Mar 18, 2024	Object, Segmentation	Code Available	1
OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework	Mar 13, 2024	All, Management	Unverified	0
Augmenting Efficient Real-time Surgical Instrument Segmentation in Video with Point Tracking and Segment Anything	Mar 12, 2024	GPU, Point Tracking	Code Available	1
ClickVOS: Click Video Object Segmentation	Mar 10, 2024	Object, Segmentation	Code Available	0
Depth-aware Test-Time Training for Zero-shot Video Object Segmentation	Mar 7, 2024	Depth Estimation, Depth Prediction	Code Available	1
Deep Common Feature Mining for Efficient Video Semantic Segmentation	Mar 5, 2024	Semantic Segmentation, Video Semantic Segmentation	Code Available	0
Motion-Corrected Moving Average: Including Post-Hoc Temporal Information for Improved Video Segmentation	Mar 5, 2024	Optical Flow Estimation, Segmentation	Unverified	0
VideoMAC: Video Masked Autoencoders Meet ConvNets	Feb 29, 2024	Pose Tracking, Representation Learning	Code Available	1
UniVS: Unified and Universal Video Segmentation with Prompts as Queries	Feb 28, 2024	Decoder, Referring Expression Segmentation	Code Available	3
PolypNextLSTM: A lightweight and fast polyp video segmentation network using ConvNext and ConvLSTM	Feb 18, 2024	Segmentation, Video Segmentation	Code Available	0
Lester: rotoscope animation through video object segmentation and tracking	Feb 15, 2024	3D Human Pose Estimation, Object	Code Available	1
Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation	Feb 14, 2024	Decoder, Object	Unverified	0
Point-VOS: Pointing Up Video Object Segmentation	Feb 8, 2024	Object, Semantic Segmentation	Unverified	0
Is Two-shot All You Need? A Label-efficient Approach for Video Segmentation in Breast Ultrasound	Feb 7, 2024	All, Lesion Segmentation	Unverified	0
We're Not Using Videos Effectively: An Updated Domain Adaptive Video Segmentation Baseline	Feb 1, 2024	Benchmarking, Domain Adaptation	Code Available	1
Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes	Jan 27, 2024	Motion Estimation, Segmentation	Code Available	0
Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention	Jan 25, 2024	Knowledge Distillation, Object	Unverified	0
Vivim: a Video Vision Mamba for Medical Video Segmentation	Jan 25, 2024	Lesion Segmentation, Mamba	Code Available	2
Explore Synergistic Interaction Across Frames for Interactive Video Object Segmentation	Jan 23, 2024	Interactive Video Object Segmentation, Semantic Segmentation	Unverified	0
Understanding Video Transformers via Universal Concept Discovery	Jan 19, 2024	Action Recognition, Decision Making	Unverified	0
OMG-Seg: Is One Model Good Enough For All Segmentation?	Jan 18, 2024	All, Decoder	Code Available	5
RAP-SAM: Towards Real-Time All-Purpose Segment Anything	Jan 18, 2024	All, Decoder	Code Available	3
Learning to Segment Referred Objects from Narrated Egocentric Videos	Jan 1, 2024	Object, Segmentation	Unverified	0
MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation	Jan 1, 2024	Segmentation, Video Segmentation	Code Available	2
Infer from What You Have Seen Before: Temporally-dependent Classifier for Semi-supervised Video Segmentation	Jan 1, 2024	Representation Learning, Semantic Segmentation	Code Available	0
1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation	Jan 1, 2024	Object, Referring Video Object Segmentation	Code Available	1
Tracking with Human-Intent Reasoning	Dec 29, 2023	Language Modelling, Object	Code Available	1
UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces	Dec 25, 2023	Image Segmentation, Object	Code Available	2
DVIS++: Improved Decoupled Framework for Universal Video Segmentation	Dec 20, 2023	Contrastive Learning, Denoising	Code Available	1
No More Shortcuts: Realizing the Potential of Temporal Self-Supervision	Dec 20, 2023	Action Classification, Attribute	Unverified	0
Appearance-Based Refinement for Object-Centric Motion Segmentation	Dec 18, 2023	Motion Segmentation, Object	Unverified	0
AutoVisual Fusion Suite: A Comprehensive Evaluation of Image Segmentation and Voice Conversion Tools on HuggingFace Platform	Dec 17, 2023	Image Segmentation, Segmentation	Code Available	1
Artificial intelligence optical hardware empowers high-resolution hyperspectral video understanding at 1.2 Tb/s	Dec 17, 2023	Semantic Segmentation, Video Semantic Segmentation	Unverified	0
Hierarchical Graph Pattern Understanding for Zero-Shot VOS	Dec 15, 2023	Decoder, Graph Neural Network	Code Available	0
TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking	Dec 13, 2023	Semantic Segmentation, Video Object Segmentation	Unverified	0
Semi-supervised Active Learning for Video Action Detection	Dec 12, 2023	Action Detection, Active Learning	Code Available	0
Flexible visual prompts for in-context learning in computer vision	Dec 11, 2023	Image Segmentation, In-Context Learning	Code Available	0
GenDeF: Learning Generative Deformation Field for Video Generation	Dec 7, 2023	Disentanglement, Video Editing	Unverified	0

Datasets: Cityscapes val, CamVid, VSPW, LaRS, Multispectral Video Semantic Segmentation

Benchmark Results

Cityscapes val

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	mIoU	80.3	—	Unverified
2	DeltaDist-DDRNet-39	mIoU	79.9	—	Unverified
3	TDNet-50 [9]	mIoU	79.9	—	Unverified
4	PSPNet-101 [20]	mIoU	79.7	—	Unverified
5	PSPNet-50 [20]	mIoU	78.1	—	Unverified
6	LVS [12]	mIoU	76.8	—	Unverified
7	GRFP [15]	mIoU	73.6	—	Unverified
8	FCN-50 [14]	mIoU	70.1	—	Unverified
9	DFF [22]	mIoU	69.2	—	Unverified

CamVid

#	Model	Metric	Claimed	Verified	Status
1	TMANet-50	Mean IoU	76.5	—	Unverified
2	ETC-MobileNet	Mean IoU	76.3	—	Unverified
3	TDNet-50	Mean IoU	76.2	—	Unverified
4	PSPNet-50	Mean IoU	76.0	—	Unverified
5	Netwarp	Mean IoU	74.7	—	Unverified
6	GRFP	Mean IoU	67.1	—	Unverified

VSPW

#	Model	Metric	Claimed	Verified	Status
1	DVIS++ (ViT-L)	mIoU	63.8	—	Unverified
2	UniVS (Swin-L)	mIoU	59.8	—	Unverified
3	Tube-Link (Swin-large)	mIoU	59.6	—	Unverified
4	MRCFA (MiT-B5)	mIoU	49.9	—	Unverified
5	CFFM (MiT-B5)	mIoU	49.3	—	Unverified

LaRS

#	Model	Metric	Claimed	Verified	Status
1	WaSR-T (ResNet-101)	Q	60.1	—	Unverified
2	TMANet (ResNet-50)	Q	57.5	—	Unverified
3	CSANet (ResNet-101)	Q	49.1	—	Unverified

Multispectral Video Semantic Segmentation

#	Model	Metric	Claimed	Verified	Status
1	MVNet (DeepLabV3)	mIoU	54.52	—	Unverified
2	MVNet (PSPNet)	mIoU	54.36	—	Unverified
3	MVNet (FCN)	mIoU	53.9	—	Unverified