| TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action | May 2, 2025 | Dense Captioning, Highlight Detection | Code Available | 1 | 5 |
| Multi-modal Segment Assemblage Network for Ad Video Editing with Importance-Coherence Reward | Sep 25, 2022 | Decoder, Video Editing | Code Available | 1 | 5 |
| MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation | Jan 23, 2025 | Referring Expression Segmentation, Referring Video Object Segmentation | Code Available | 1 | 5 |
| Multi-Granularity Video Object Segmentation | Dec 2, 2024 | Object, Segmentation | Code Available | 1 | 5 |
| 1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation | Jun 11, 2024 | Referring Video Object Segmentation, Segmentation | Code Available | 1 | 5 |
| Coarse to Fine Multi-Resolution Temporal Convolutional Network | May 23, 2021 | Action Segmentation, Decoder | Code Available | 1 | 5 |
| PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | Feb 14, 2023 | Decoder, Image Segmentation | Code Available | 1 | 5 |
| Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation | Apr 6, 2022 | Optical Flow Estimation, Referring Expression Segmentation | Code Available | 1 | 5 |
| BST: Badminton Stroke-type Transformer for Skeleton-based Action Recognition in Racket Sports | Feb 28, 2025 | Action Recognition, Line Detection | Code Available | 1 | 5 |
| Physarum Powered Differentiable Linear Programming Layers and Applications | Apr 30, 2020 | Few-Shot Learning, Meta-Learning | Code Available | 1 | 5 |
| TarViS: A Unified Approach for Target-based Video Segmentation | Jan 6, 2023 | Instance Segmentation, Panoptic Segmentation | Code Available | 1 | 5 |
| A Survey on Deep Learning Technique for Video Segmentation | Jul 2, 2021 | Autonomous Driving, Deep Learning | Code Available | 1 | 5 |
| Separable Convolutional LSTMs for Faster Video Segmentation | Jul 16, 2019 | GPU, Image Segmentation | Code Available | 1 | 5 |
| A Simple Video Segmenter by Tracking Objects Along Axial Trajectories | Nov 30, 2023 | GPU, Object | Code Available | 1 | 5 |
| MediViSTA: Medical Video Segmentation via Temporal Fusion SAM Adaptation for Echocardiography | Sep 24, 2023 | Image Segmentation, Medical Image Segmentation | Code Available | 1 | 5 |
| Differentiable Soft-Masked Attention | Jun 1, 2022 | Object, Segmentation | Code Available | 1 | 5 |
| Local-Global Context Aware Transformer for Language-Guided Video Segmentation | Mar 18, 2022 | Referring Expression Segmentation, Referring Video Object Segmentation | Code Available | 1 | 5 |
| Semantic Segmentation of Video Sequences with Convolutional LSTMs | May 3, 2019 | Decoder, Image Segmentation | Code Available | 1 | 5 |
| Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation | Nov 29, 2023 | Clustering, Object | Code Available | 1 | 5 |
| Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations | Jul 18, 2022 | object-detection, Object Detection | Code Available | 1 | 5 |
| AuxAdapt: Stable and Efficient Test-Time Adaptation for Temporally Consistent Video Semantic Segmentation | Oct 24, 2021 | Optical Flow Estimation, Segmentation | Code Available | 1 | 5 |
| AutoVisual Fusion Suite: A Comprehensive Evaluation of Image Segmentation and Voice Conversion Tools on HuggingFace Platform | Dec 17, 2023 | Image Segmentation, Segmentation | Code Available | 1 | 5 |
| In-N-Out Generative Learning for Dense Unsupervised Video Segmentation | Mar 29, 2022 | Contrastive Learning, Semantic Segmentation | Code Available | 1 | 5 |
| Decoupled Seg Tokens Make Stronger Reasoning Video Segmenter and Grounder | Jun 28, 2025 | Image Segmentation, Large Language Model | Code Available | 1 | 5 |
| Global Knowledge Calibration for Fast Open-Vocabulary Segmentation | Mar 16, 2023 | Knowledge Distillation, Open Vocabulary Semantic Segmentation | Code Available | 1 | 5 |
| DC-SAM: In-Context Segment Anything in Images and Videos via Dual Consistency | Apr 16, 2025 | Few-Shot Learning, Interactive Segmentation | Code Available | 1 | 5 |
| GraphEcho: Graph-Driven Unsupervised Domain Adaptation for Echocardiogram Video Segmentation | Sep 20, 2023 | Domain Adaptation, Graph Matching | Code Available | 1 | 5 |
| SASVi - Segment Any Surgical Video | Feb 12, 2025 | Segmentation, Video Segmentation | Code Available | 1 | 5 |
| SALI: Short-term Alignment and Long-term Interaction Network for Colonoscopy Video Polyp Segmentation | Jun 19, 2024 | Segmentation, Video Polyp Segmentation | Code Available | 1 | 5 |
| Generic Event Boundary Detection: A Benchmark for Event Segmentation | Jan 26, 2021 | Action Detection, Boundary Detection | Code Available | 1 | 5 |
| D2Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos | Nov 15, 2021 | Multi-Object Tracking and Segmentation, Segmentation | Code Available | 1 | 5 |
| D^2Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos | Nov 15, 2021 | Segmentation, Semantic Segmentation | Code Available | 1 | 5 |
| Actor and Action Video Segmentation from a Sentence | Mar 20, 2018 | Action Segmentation, Decoder | Code Available | 1 | 5 |
| SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost | Jun 2, 2025 | Image Segmentation, Semantic Segmentation | Code Available | 1 | 5 |
| Cross-Modal Progressive Comprehension for Referring Segmentation | May 15, 2021 | Attribute, Image Segmentation | Code Available | 1 | 5 |
| Dense Unsupervised Learning for Video Segmentation | Nov 11, 2021 | Segmentation, Semantic Segmentation | Code Available | 1 | 5 |
| Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks | Jan 17, 2025 | Few-Shot Semantic Segmentation, Segmentation | Code Available | 1 | 5 |
| Real-Time Video Inference on Edge Devices via Adaptive Model Streaming | Jun 11, 2020 | Knowledge Distillation, Semantic Segmentation | Code Available | 1 | 5 |
| Flow-based Video Segmentation for Human Head and Shoulders | Apr 20, 2021 | Decoder, Image Matting | Code Available | 1 | 5 |
| Making a Case for 3D Convolutions for Object Segmentation in Videos | Aug 26, 2020 | Decoder, Segmentation | Code Available | 1 | 5 |
| EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations | Sep 26, 2022 | Object, Segmentation | Code Available | 1 | 5 |
| Domain Adaptive Video Segmentation via Temporal Pseudo Supervision | Jul 6, 2022 | Segmentation, Semantic Segmentation | Code Available | 1 | 5 |
| Robust Semantic Segmentation in Adverse Weather Conditions by means of Fast Video-Sequence Segmentation | Jul 1, 2020 | Image Segmentation, Segmentation | Code Available | 1 | 5 |
| DVIS++: Improved Decoupled Framework for Universal Video Segmentation | Dec 20, 2023 | Contrastive Learning, Denoising | Code Available | 1 | 5 |
| Efficient Semantic Video Segmentation with Per-frame Inference | Feb 26, 2020 | Knowledge Distillation, Optical Flow Estimation | Code Available | 1 | 5 |
| Stochastic positional embeddings improve masked image modeling | Jul 31, 2023 | Language Modelling, Masked Language Modeling | Code Available | 1 | 5 |
| Context-Aware Relative Object Queries To Unify Video Instance and Panoptic Segmentation | Jan 1, 2023 | Instance Segmentation, Multi-Object Tracking | Code Available | 1 | 5 |
| General and Task-Oriented Video Segmentation | Jul 9, 2024 | Disentanglement, Diversity | Code Available | 1 | 5 |
| CamSAM2: Segment Anything Accurately in Camouflaged Videos | Mar 25, 2025 | Camouflaged Object Segmentation, Object | Code Available | 1 | 5 |
| Segmenting Moving Objects via an Object-Centric Layered Representation | Jul 5, 2022 | Instance Segmentation, Motion Segmentation | Code Available | 1 | 5 |