SAM 2: Segment Anything in Images and Videos | Aug 1, 2024 | Tags: Image Segmentation Robot Manipulation Generalization | Code Available: 11
Efficient Track Anything | Nov 28, 2024 | Tags: Object Segmentation | Code Available: 7
Segment Anything in Medical Images and Videos: Benchmark and Deployment | Aug 6, 2024 | Tags: Benchmarking Segmentation | Code Available: 7
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation | Apr 7, 2025 | Tags: Inference Optimization Referring Video Object Segmentation | Code Available: 5
4th PVUW MeViS 3rd Place Report: Sa2VA | Apr 1, 2025 | Tags: Language Modeling Language Modelling | Code Available: 5
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Jan 7, 2025 | Tags: 2k Language Modeling | Code Available: 5
Underwater Camouflaged Object Tracking Meets Vision-Language SAM2 | Sep 25, 2024 | Tags: Object Object Tracking | Code Available: 5
Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey | Aug 23, 2024 | Tags: Image Segmentation Segmentation | Code Available: 5
OMG-Seg: Is One Model Good Enough For All Segmentation? | Jan 18, 2024 | Tags: All Decoder | Code Available: 5
MedSAM2: Segment Anything in 3D Medical Images and Videos | Apr 4, 2025 | Tags: Segmentation Video Segmentation | Code Available: 4
EdgeTAM: On-Device Track Anything Model | Jan 13, 2025 | Tags: model Video Segmentation | Code Available: 4
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree | Oct 21, 2024 | Tags: Heuristic Search Object | Code Available: 4
PVUW 2024 Challenge on Complex Video Understanding: Methods and Results | Jun 24, 2024 | Tags: Segmentation Semantic Segmentation | Code Available: 4
SegGPT: Segmenting Everything In Context | Apr 6, 2023 | Tags: Few-Shot Semantic Segmentation In-Context Learning | Code Available: 4
SiamMask: A Framework for Fast Online Object Tracking and Segmentation | Jul 5, 2022 | Tags: Multiple Object Tracking Object | Code Available: 4
Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dependent Concepts under Different Scenes | Dec 2, 2024 | Tags: In-Context Learning Video Segmentation | Code Available: 3
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation | Nov 26, 2024 | Tags: Natural Language Understanding Referring Video Object Segmentation | Code Available: 3
SMITE: Segment Me In TimE | Oct 24, 2024 | Tags: Segmentation Semantic Segmentation | Code Available: 3
Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2 | Aug 3, 2024 | Tags: Diversity Segmentation | Code Available: 3
VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | Tags: Decoder Object | Code Available: 3
Moving Object Segmentation: All You Need Is SAM (and Flow) | Apr 18, 2024 | Tags: All Motion Segmentation | Code Available: 3
PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | Mar 21, 2024 | Tags: Decoder Generalized Referring Expression Segmentation | Code Available: 3
UniVS: Unified and Universal Video Segmentation with Prompts as Queries | Feb 28, 2024 | Tags: Decoder Referring Expression Segmentation | Code Available: 3
RAP-SAM: Towards Real-Time All-Purpose Segment Anything | Jan 18, 2024 | Tags: All Decoder | Code Available: 3
Putting the Object Back into Video Object Segmentation | Oct 19, 2023 | Tags: Object Segmentation | Code Available: 3
Tracking Anything with Decoupled Video Segmentation | Sep 7, 2023 | Tags: Open-Vocabulary Video Segmentation Open-World Video Segmentation | Code Available: 3
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation | Aug 28, 2023 | Tags: Instance Segmentation Optical Flow Estimation | Code Available: 3
Personalize Segment Anything Model with One Shot | May 4, 2023 | Tags: Image Generation model | Code Available: 3
XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model | Jul 14, 2022 | Tags: 2D Human Pose Estimation 2D Object Detection | Code Available: 3
Min-Max Similarity: A Contrastive Semi-Supervised Deep Learning Network for Surgical Tools Segmentation | Mar 29, 2022 | Tags: Contrastive Learning Segmentation | Code Available: 3
VideoMolmo: Spatio-Temporal Grounding Meets Pointing | Jun 5, 2025 | Tags: Autonomous Driving Autonomous Navigation | Code Available: 2
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration | May 26, 2025 | Tags: Domain Generalization Hallucination | Code Available: 2
GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | Apr 10, 2025 | Tags: Contrastive Learning Language Modeling | Code Available: 2
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | Mar 5, 2025 | Tags: Object Referring Video Object Segmentation | Code Available: 2
HyperSeg: Hybrid Segmentation Assistant with Fine-grained Visual Perceiver | Jan 1, 2025 | Tags: Reasoning Segmentation Segmentation | Code Available: 2
InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models | Dec 18, 2024 | Tags: Reasoning Segmentation Segmentation | Code Available: 2
Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity | Dec 9, 2024 | Tags: Anomaly Detection text annotation | Code Available: 2
Det-SAM2:Technical Report on the Self-Prompting Segmentation Framework Based on Segment Anything Model 2 | Nov 28, 2024 | Tags: Video Segmentation Video Semantic Segmentation | Code Available: 2
IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos | Nov 18, 2024 | Tags: Pose Estimation Semantic Segmentation | Code Available: 2
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos | Sep 29, 2024 | Tags: All Image Segmentation | Code Available: 2
Self-Prompting Polyp Segmentation in Colonoscopy using Hybrid Yolo-SAM 2 Model | Sep 14, 2024 | Tags: Medical Image Segmentation Polyp Segmentation | Code Available: 2
Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation | Aug 28, 2024 | Tags: Object Semantic Segmentation | Code Available: 2
Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning | Aug 15, 2024 | Tags: Segmentation Video Segmentation | Code Available: 2
Zero-Shot Video Semantic Segmentation based on Pre-Trained Diffusion Models | May 27, 2024 | Tags: Segmentation Semantic correspondence | Code Available: 2
LVOS: A Benchmark for Large-scale Long-term Video Object Segmentation | Apr 30, 2024 | Tags: Attribute Semantic Segmentation | Code Available: 2
Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation | Apr 21, 2024 | Tags: Semantic Segmentation Video Object Segmentation | Code Available: 2
Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation | Apr 4, 2024 | Tags: Contrastive Learning Referring Expression | Code Available: 2
DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries | Mar 29, 2024 | Tags: Object Video Instance Segmentation | Code Available: 2
Efficient Video Object Segmentation via Modulated Cross-Attention Memory | Mar 26, 2024 | Tags: GPU Object | Code Available: 2
Vivim: a Video Vision Mamba for Medical Video Segmentation | Jan 25, 2024 | Tags: Lesion Segmentation Mamba | Code Available: 2