| 1st Place Solution for MOSE Track in CVPR 2024 PVUW Workshop: Complex Video Object Segmentation | Jun 7, 2024 | Object, Segmentation | Unverified | 0 |
| 3rd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation | Jun 6, 2024 | Object, Position | Unverified | 0 |
| Lifelong Learning Using a Dynamically Growing Tree of Sub-networks for Domain Generalization in Video Object Segmentation | May 29, 2024 | Domain Generalization, Lifelong learning | Unverified | 0 |
| One-shot Training for Video Object Segmentation | May 22, 2024 | Object, Semantic Segmentation | Unverified | 0 |
| Harnessing Vision-Language Pretrained Models with Temporal-Aware Adaptation for Referring Video Object Segmentation | May 17, 2024 | Referring Expression Segmentation, Referring Video Object Segmentation | Unverified | 0 |
| DeVOS: Flow-Guided Deformable Transformer for Video Object Segmentation | May 11, 2024 | Optical Flow Estimation, Semantic Segmentation | Unverified | 0 |
| Global Motion Understanding in Large-Scale Video Object Segmentation | May 11, 2024 | Instance Segmentation, Optical Flow Estimation | Unverified | 0 |
| Space-time Reinforcement Network for Video Object Segmentation | May 7, 2024 | Object, Semantic Segmentation | Unverified | 0 |
| LVOS: A Benchmark for Large-scale Long-term Video Object Segmentation | Apr 30, 2024 | Attribute, Semantic Segmentation | Code Available | 2 |
| 360VOTS: Visual Object Tracking and Segmentation in Omnidirectional Videos | Apr 22, 2024 | Object, Object Tracking | Unverified | 0 |
| Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation | Apr 21, 2024 | Semantic Segmentation, Video Object Segmentation | Code Available | 2 |
| Moving Object Segmentation: All You Need Is SAM (and Flow) | Apr 18, 2024 | All, Motion Segmentation | Code Available | 3 |
| Spatial-Temporal Multi-level Association for Video Object Segmentation | Apr 9, 2024 | Object, Segmentation | Unverified | 0 |
| Event-assisted Low-Light Video Object Segmentation | Apr 2, 2024 | Object, Semantic Segmentation | Code Available | 1 |
| Temporally Consistent Referring Video Object Segmentation with Hybrid Memory | Mar 28, 2024 | HTR, Object | Code Available | 1 |
| Annolid: Annotate, Segment, and Track Anything You Need | Mar 27, 2024 | Instance Segmentation, Segmentation | Code Available | 0 |
| Efficient Video Object Segmentation via Modulated Cross-Attention Memory | Mar 26, 2024 | GPU, Object | Code Available | 2 |
| PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | Mar 21, 2024 | Decoder, Generalized Referring Expression Segmentation | Code Available | 3 |
| Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation | Mar 18, 2024 | Referring Video Object Segmentation, Semantic Segmentation | Code Available | 1 |
| Video Object Segmentation with Dynamic Query Modulation | Mar 18, 2024 | Object, Segmentation | Code Available | 1 |
| OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework | Mar 13, 2024 | All, Management | Unverified | 0 |
| Augmenting Efficient Real-time Surgical Instrument Segmentation in Video with Point Tracking and Segment Anything | Mar 12, 2024 | GPU, Point Tracking | Code Available | 1 |
| ClickVOS: Click Video Object Segmentation | Mar 10, 2024 | Object, Segmentation | Code Available | 0 |
| Depth-aware Test-Time Training for Zero-shot Video Object Segmentation | Mar 7, 2024 | Depth Estimation, Depth Prediction | Code Available | 1 |
| VideoMAC: Video Masked Autoencoders Meet ConvNets | Feb 29, 2024 | Pose Tracking, Representation Learning | Code Available | 1 |
| UniVS: Unified and Universal Video Segmentation with Prompts as Queries | Feb 28, 2024 | Decoder, Referring Expression Segmentation | Code Available | 3 |
| Lester: rotoscope animation through video object segmentation and tracking | Feb 15, 2024 | 3D Human Pose Estimation, Object | Code Available | 1 |
| Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation | Feb 14, 2024 | Decoder, Object | Unverified | 0 |
| Point-VOS: Pointing Up Video Object Segmentation | Feb 8, 2024 | Object, Semantic Segmentation | Unverified | 0 |
| Is Two-shot All You Need? A Label-efficient Approach for Video Segmentation in Breast Ultrasound | Feb 7, 2024 | All, Lesion Segmentation | Unverified | 0 |
| Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention | Jan 25, 2024 | Knowledge Distillation, Object | Unverified | 0 |
| Vivim: a Video Vision Mamba for Medical Video Segmentation | Jan 25, 2024 | Lesion Segmentation, Mamba | Code Available | 2 |
| Explore Synergistic Interaction Across Frames for Interactive Video Object Segmentation | Jan 23, 2024 | Interactive Video Object Segmentation, Semantic Segmentation | Unverified | 0 |
| Understanding Video Transformers via Universal Concept Discovery | Jan 19, 2024 | Action Recognition, Decision Making | Unverified | 0 |
| OMG-Seg: Is One Model Good Enough For All Segmentation? | Jan 18, 2024 | All, Decoder | Code Available | 5 |
| Learning to Segment Referred Objects from Narrated Egocentric Videos | Jan 1, 2024 | Object, Segmentation | Unverified | 0 |
| 1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation | Jan 1, 2024 | Object, Referring Video Object Segmentation | Code Available | 1 |
| Tracking with Human-Intent Reasoning | Dec 29, 2023 | Language Modelling, Object | Code Available | 1 |
| UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces | Dec 25, 2023 | Image Segmentation, Object | Code Available | 2 |
| No More Shortcuts: Realizing the Potential of Temporal Self-Supervision | Dec 20, 2023 | Action Classification, Attribute | Unverified | 0 |
| Hierarchical Graph Pattern Understanding for Zero-Shot VOS | Dec 15, 2023 | Decoder, Graph Neural Network | Code Available | 0 |
| General Object Foundation Model for Images and Videos at Scale | Dec 14, 2023 | Instance Segmentation, Long-tail Video Object Segmentation | Code Available | 3 |
| TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking | Dec 13, 2023 | Semantic Segmentation, Video Object Segmentation | Unverified | 0 |
| Semi-supervised Active Learning for Video Action Detection | Dec 12, 2023 | Action Detection, Active Learning | Code Available | 0 |
| Flexible visual prompts for in-context learning in computer vision | Dec 11, 2023 | Image Segmentation, In-Context Learning | Code Available | 0 |
| SimulFlow: Simultaneously Extracting Feature and Identifying Target for Unsupervised Video Object Segmentation | Nov 30, 2023 | Object, object-detection | Unverified | 0 |
| VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models | Nov 30, 2023 | Semantic Segmentation, Video Editing | Unverified | 0 |
| Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation | Nov 29, 2023 | Clustering, Object | Code Available | 1 |
| SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation | Nov 24, 2023 | Meta-Learning, One-Shot Segmentation | Code Available | 1 |
| Sketch-based Video Object Segmentation: Benchmark and Analysis | Nov 13, 2023 | Object, Segmentation | Unverified | 0 |