SAM 2: Segment Anything in Images and Videos | Aug 1, 2024 | Image Segmentation Robot Manipulation Generalization | Code Available 115
Segment Anything in Medical Images and Videos: Benchmark and Deployment | Aug 6, 2024 | Benchmarking Segmentation | Code Available 75
Efficient Track Anything | Nov 28, 2024 | Object Segmentation | Code Available 75
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Jan 7, 2025 | 2k Language Modeling | Code Available 55
OMG-Seg: Is One Model Good Enough For All Segmentation? | Jan 18, 2024 | All Decoder | Code Available 55
Underwater Camouflaged Object Tracking Meets Vision-Language SAM2 | Sep 25, 2024 | Object Object Tracking | Code Available 55
4th PVUW MeViS 3rd Place Report: Sa2VA | Apr 1, 2025 | Language Modeling Language Modelling | Code Available 55
Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey | Aug 23, 2024 | Image Segmentation Segmentation | Code Available 55
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation | Apr 7, 2025 | Inference Optimization Referring Video Object Segmentation | Code Available 55
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree | Oct 21, 2024 | Heuristic Search Object | Code Available 45
MedSAM2: Segment Anything in 3D Medical Images and Videos | Apr 4, 2025 | Segmentation Video Segmentation | Code Available 45
SegGPT: Segmenting Everything In Context | Apr 6, 2023 | Few-Shot Semantic Segmentation In-Context Learning | Code Available 45
EdgeTAM: On-Device Track Anything Model | Jan 13, 2025 | model Video Segmentation | Code Available 45
SiamMask: A Framework for Fast Online Object Tracking and Segmentation | Jul 5, 2022 | Multiple Object Tracking Object | Code Available 45
PVUW 2024 Challenge on Complex Video Understanding: Methods and Results | Jun 24, 2024 | Segmentation Semantic Segmentation | Code Available 45
SMITE: Segment Me In TimE | Oct 24, 2024 | Segmentation Semantic Segmentation | Code Available 35
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation | Nov 26, 2024 | Natural Language Understanding Referring Video Object Segmentation | Code Available 35
RAP-SAM: Towards Real-Time All-Purpose Segment Anything | Jan 18, 2024 | All Decoder | Code Available 35
Personalize Segment Anything Model with One Shot | May 4, 2023 | Image Generation model | Code Available 35
Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2 | Aug 3, 2024 | Diversity Segmentation | Code Available 35
XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model | Jul 14, 2022 | 2D Human Pose Estimation 2D Object Detection | Code Available 35
Min-Max Similarity: A Contrastive Semi-Supervised Deep Learning Network for Surgical Tools Segmentation | Mar 29, 2022 | Contrastive Learning Segmentation | Code Available 35
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation | Aug 28, 2023 | Instance Segmentation Optical Flow Estimation | Code Available 35
PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | Mar 21, 2024 | Decoder Generalized Referring Expression Segmentation | Code Available 35
Putting the Object Back into Video Object Segmentation | Oct 19, 2023 | Object Segmentation | Code Available 35
Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dependent Concepts under Different Scenes | Dec 2, 2024 | In-Context Learning Video Segmentation | Code Available 35
VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | Decoder Object | Code Available 35
UniVS: Unified and Universal Video Segmentation with Prompts as Queries | Feb 28, 2024 | Decoder Referring Expression Segmentation | Code Available 35
Tracking Anything with Decoupled Video Segmentation | Sep 7, 2023 | Open-Vocabulary Video Segmentation Open-World Video Segmentation | Code Available 35
Moving Object Segmentation: All You Need Is SAM (and Flow) | Apr 18, 2024 | All Motion Segmentation | Code Available 35
Self-Prompting Polyp Segmentation in Colonoscopy using Hybrid Yolo-SAM 2 Model | Sep 14, 2024 | Medical Image Segmentation Polyp Segmentation | Code Available 25
Audio-Visual Segmentation with Semantics | Jan 30, 2023 | Segmentation Semantic Segmentation | Code Available 25
Decoupling Features in Hierarchical Propagation for Video Object Segmentation | Oct 18, 2022 | Object Semantic Segmentation | Code Available 25
Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning | Aug 15, 2024 | Segmentation Video Segmentation | Code Available 25
MOSE: A New Dataset for Video Object Segmentation in Complex Scenes | Feb 3, 2023 | Object Segmentation | Code Available 25
Scalable Video Object Segmentation with Identification Mechanism | Mar 22, 2022 | Object Segmentation | Code Available 25
MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions | Aug 16, 2023 | Motion Expressions Guided Video Segmentation Object | Code Available 25
Mask2Former for Video Instance Segmentation | Dec 20, 2021 | Image Segmentation Instance Segmentation | Code Available 25
Language as Queries for Referring Video Object Segmentation | Jan 3, 2022 | Object Object Tracking | Code Available 25
InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models | Dec 18, 2024 | Reasoning Segmentation Segmentation | Code Available 25
LVOS: A Benchmark for Large-scale Long-term Video Object Segmentation | Apr 30, 2024 | Attribute Semantic Segmentation | Code Available 25
MCIBI++: Soft Mining Contextual Information Beyond Image for Semantic Segmentation | Sep 9, 2022 | Segmentation Semantic Segmentation | Code Available 25
MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation | Jan 1, 2024 | Segmentation Video Segmentation | Code Available 25
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration | May 26, 2025 | Domain Generalization Hallucination | Code Available 25
IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos | Nov 18, 2024 | Pose Estimation Semantic Segmentation | Code Available 25
HyperSeg: Hybrid Segmentation Assistant with Fine-grained Visual Perceiver | Jan 1, 2025 | Reasoning Segmentation Segmentation | Code Available 25
In Defense of Online Models for Video Instance Segmentation | Jul 21, 2022 | Contrastive Learning Instance Segmentation | Code Available 25
GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | Apr 10, 2025 | Contrastive Learning Language Modeling | Code Available 25
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | Mar 5, 2025 | Object Referring Video Object Segmentation | Code Available 25
Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity | Dec 9, 2024 | Anomaly Detection text annotation | Code Available 25