| Multi-Granularity Video Object Segmentation | Dec 2, 2024 | Object, Segmentation | Code Available | 1 |
| Efficient Track Anything | Nov 28, 2024 | Object, Segmentation | Code Available | 7 |
| Track Anything Behind Everything: Zero-Shot Amodal Video Object Segmentation | Nov 28, 2024 | 3D Reconstruction, Segmentation | Unverified | 0 |
| HyperSeg: Towards Universal Visual Segmentation with Large Language Model | Nov 26, 2024 | Language Modeling, Large Language Model | Code Available | 2 |
| SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation | Nov 26, 2024 | Natural Language Understanding, Referring Video Object Segmentation | Code Available | 3 |
| ClickTrack: Towards Real-time Interactive Single Object Tracking | Nov 20, 2024 | Object, Object Tracking | Unverified | 0 |
| IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos | Nov 18, 2024 | Pose Estimation, Semantic Segmentation | Code Available | 2 |
| LiVOS: Light Video Object Segmentation with Gated Linear Matching | Nov 5, 2024 | GPU, Semantic Segmentation | Code Available | 1 |
| Continuous Spatio-Temporal Memory Networks for 4D Cardiac Cine MRI Segmentation | Oct 30, 2024 | Anatomy, MRI segmentation | Code Available | 0 |
| Addressing Issues with Working Memory in Video Object Segmentation | Oct 29, 2024 | Inductive Bias, Object | Unverified | 0 |
| SMITE: Segment Me In TimE | Oct 24, 2024 | Segmentation, Semantic Segmentation | Code Available | 3 |
| SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree | Oct 21, 2024 | Heuristic Search, Object | Code Available | 4 |
| One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos | Sep 29, 2024 | All, Image Segmentation | Code Available | 2 |
| X-Prompt: Multi-modal Visual Prompt for Video Object Segmentation | Sep 28, 2024 | Semantic Segmentation, Video Object Segmentation | Code Available | 1 |
| Memory Matching is not Enough: Jointly Improving Memory Matching and Decoding for Video Object Segmentation | Sep 22, 2024 | Semantic Segmentation, Semi-Supervised Video Object Segmentation | Unverified | 0 |
| LSVOS Challenge Report: Large-scale Complex and Long Video Object Segmentation | Sep 9, 2024 | Object, Referring Video Object Segmentation | Unverified | 0 |
| Discriminative Spatial-Semantic VOS Solution: 1st Place Solution for 6th LSVOS | Aug 29, 2024 | Object, Object Recognition | Code Available | 0 |
| Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation | Aug 28, 2024 | Object, Semantic Segmentation | Code Available | 2 |
| CSS-Segment: 2nd Place Report of LSVOS Challenge VOS Track | Aug 24, 2024 | Autonomous Driving, Object | Unverified | 0 |
| The 2nd Solution for LSVOS Challenge RVOS Track: Spatial-temporal Refinement for Consistent Semantic Segmentation | Aug 22, 2024 | Referring Video Object Segmentation, Segmentation | Unverified | 0 |
| The Instance-centric Transformer for the RVOS Track of LSVOS Challenge: 3rd Place Solution | Aug 20, 2024 | Referring Video Object Segmentation, Retrieval | Unverified | 0 |
| LSVOS Challenge 3rd Place Report: SAM2 and Cutie based VOS | Aug 20, 2024 | Instance Segmentation, Object | Unverified | 0 |
| 3D-Aware Instance Segmentation and Tracking in Egocentric Videos | Aug 19, 2024 | 3D Object Reconstruction, Instance Segmentation | Unverified | 0 |
| Video Object Segmentation via SAM 2: The 4th Solution for LSVOS Challenge VOS Track | Aug 19, 2024 | Object, Segmentation | Unverified | 0 |
| UNINEXT-Cutie: The 1st Solution for LSVOS Challenge RVOS Track | Aug 19, 2024 | Referring Video Object Segmentation, Semantic Segmentation | Unverified | 0 |
| Fast Sprite Decomposition from Animated Graphics | Aug 7, 2024 | Semantic Segmentation, Video Object Segmentation | Unverified | 0 |
| Biomedical SAM 2: Segment Anything in Biomedical Images and Videos | Aug 6, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 0 |
| SAM 2: Segment Anything in Images and Videos | Aug 1, 2024 | Image Segmentation, Robot Manipulation Generalization | Code Available | 11 |
| Strike the Balance: On-the-Fly Uncertainty based User Interactions for Long-Term Video Object Segmentation | Jul 31, 2024 | Object, Segmentation | Code Available | 0 |
| Disentangling spatio-temporal knowledge for weakly supervised object detection and segmentation in surgical video | Jul 22, 2024 | Disentanglement, Knowledge Distillation | Code Available | 0 |
| Improving Unsupervised Video Object Segmentation via Fake Flow Generation | Jul 16, 2024 | Object, object-detection | Unverified | 0 |
| VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | Decoder, Object | Code Available | 3 |
| ActionVOS: Actions as Prompts for Video Object Segmentation | Jul 10, 2024 | Object, Referring Video Object Segmentation | Code Available | 1 |
| Learning Spatial-Semantic Features for Robust Video Object Segmentation | Jul 10, 2024 | Object, Semantic Segmentation | Unverified | 0 |
| Rethinking Image-to-Video Adaptation: An Object-centric Perspective | Jul 9, 2024 | Action Recognition, Object | Unverified | 0 |
| Context Propagation from Proposals for Semantic Video Object Segmentation | Jul 8, 2024 | Object, Segmentation | Unverified | 0 |
| Submodular video object proposal selection for semantic object segmentation | Jul 8, 2024 | Object, Segmentation | Unverified | 0 |
| Non-parametric Contextual Relationship Learning for Semantic Video Object Segmentation | Jul 8, 2024 | Semantic Segmentation, Video Object Segmentation | Unverified | 0 |
| Video Inpainting Localization with Contrastive Learning | Jun 25, 2024 | Contrastive Learning, Decoder | Code Available | 1 |
| PVUW 2024 Challenge on Complex Video Understanding: Methods and Results | Jun 24, 2024 | Segmentation, Semantic Segmentation | Code Available | 4 |
| 2nd Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation | Jun 20, 2024 | Instance Segmentation, Referring Video Object Segmentation | Unverified | 0 |
| Trusted Video Inpainting Localization via Deep Attentive Noise Learning | Jun 19, 2024 | Semantic Segmentation, Video Inpainting | Code Available | 0 |
| ViDSOD-100: A New Dataset and a Baseline Model for RGB-D Video Salient Object Detection | Jun 18, 2024 | object-detection, Object Detection | Code Available | 0 |
| GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation | Jun 18, 2024 | Contrastive Learning, Object | Unverified | 0 |
| 2nd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation | Jun 12, 2024 | Instance Segmentation, Semantic Segmentation | Unverified | 0 |
| RMem: Restricted Memory Banks Improve Video Object Segmentation | Jun 12, 2024 | Object, Semantic Segmentation | Unverified | 0 |
| 1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation | Jun 11, 2024 | Referring Video Object Segmentation, Segmentation | Code Available | 1 |
| Training-Free Robust Interactive Video Object Segmentation | Jun 8, 2024 | Interactive Video Object Segmentation, Object | Unverified | 0 |
| 3rd Place Solution for MeViS Track in CVPR 2024 PVUW workshop: Motion Expression guided Video Segmentation | Jun 7, 2024 | Referring Video Object Segmentation, Semantic Segmentation | Unverified | 0 |
| A Semi-Self-Supervised Approach for Dense-Pattern Video Object Segmentation | Jun 7, 2024 | Multi-Task Learning, Object | Unverified | 0 |