| When SAM2 Meets Video Shadow and Mirror Detection | Dec 26, 2024 | Image Segmentation, Mirror Detection | Code Available | 0 |
| InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models | Dec 18, 2024 | Reasoning Segmentation, Segmentation | Code Available | 2 |
| Collaborative Hybrid Propagator for Temporal Misalignment in Audio-Visual Segmentation | Dec 11, 2024 | Video Segmentation, Video Semantic Segmentation | Unverified | 0 |
| Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity | Dec 9, 2024 | Anomaly Detection, text annotation | Code Available | 2 |
| Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dependent Concepts under Different Scenes | Dec 2, 2024 | In-Context Learning, Video Segmentation | Code Available | 3 |
| Multi-Granularity Video Object Segmentation | Dec 2, 2024 | Object, Segmentation | Code Available | 1 |
| Det-SAM2:Technical Report on the Self-Prompting Segmentation Framework Based on Segment Anything Model 2 | Nov 28, 2024 | Video Segmentation, Video Semantic Segmentation | Code Available | 2 |
| Efficient Track Anything | Nov 28, 2024 | Object, Segmentation | Code Available | 7 |
| RoMo: Robust Motion Segmentation Improves Structure from Motion | Nov 27, 2024 | Camera Calibration, Motion Segmentation | Unverified | 0 |
| SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation | Nov 26, 2024 | Natural Language Understanding, Referring Video Object Segmentation | Code Available | 3 |
| Geometric Algebra Planes: Convex Implicit Neural Volumes | Nov 20, 2024 | Decoder, Video Segmentation | Unverified | 0 |
| Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level | Nov 15, 2024 | Benchmarking, counterfactual | Unverified | 0 |
| Zero-shot capability of SAM-family models for bone segmentation in CT scans | Nov 13, 2024 | Image Segmentation, Medical Image Segmentation | Unverified | 0 |
| MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data | Nov 12, 2024 | Segmentation, Uncertainty Quantification | Code Available | 0 |
| GaussianCut: Interactive segmentation via graph cut for 3D Gaussian Splatting | Nov 12, 2024 | 3DGS, graph construction | Unverified | 0 |
| Breaking The Ice: Video Segmentation for Close-Range Ice-Covered Waters | Nov 7, 2024 | Image Segmentation, Optical Flow Estimation | Unverified | 0 |
| VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos | Nov 7, 2024 | Decoder, Language Modeling | Unverified | 0 |
| SMITE: Segment Me In TimE | Oct 24, 2024 | Segmentation, Semantic Segmentation | Code Available | 3 |
| VideoSAM: A Large Vision Foundation Model for High-Speed Video Segmentation | Oct 22, 2024 | Segmentation, Video Segmentation | Code Available | 0 |
| SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree | Oct 21, 2024 | Heuristic Search, Object | Code Available | 4 |
| Temporal-Enhanced Multimodal Transformer for Referring Multi-Object Tracking and Segmentation | Oct 17, 2024 | Multi-Object Tracking, Multi-Object Tracking and Segmentation | Unverified | 0 |
| Configurable Embodied Data Generation for Class-Agnostic RGB-D Video Segmentation | Oct 16, 2024 | Benchmarking, Panoptic Segmentation | Unverified | 0 |
| VideoSAM: Open-World Video Segmentation | Oct 11, 2024 | Autonomous Driving, Decoder | Unverified | 0 |
| Shift and matching queries for video semantic segmentation | Oct 10, 2024 | Image Segmentation, Segmentation | Unverified | 0 |
| Underwater Camouflaged Object Tracking Meets Vision-Language SAM2 | Sep 25, 2024 | Object, Object Tracking | Code Available | 5 |
| Learning Keypoints for Multi-Agent Behavior Analysis using Self-Supervision | Sep 14, 2024 | Video Segmentation, Video Semantic Segmentation | Unverified | 0 |
| Self-Prompting Polyp Segmentation in Colonoscopy using Hybrid Yolo-SAM 2 Model | Sep 14, 2024 | Medical Image Segmentation, Polyp Segmentation | Code Available | 2 |
| LSVOS Challenge Report: Large-scale Complex and Long Video Object Segmentation | Sep 9, 2024 | Object, Referring Video Object Segmentation | Unverified | 0 |
| Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey | Aug 23, 2024 | Image Segmentation, Segmentation | Code Available | 5 |
| Rethinking Video Segmentation with Masked Video Consistency: Did the Model Learn as Intended? | Aug 20, 2024 | Image Segmentation, Segmentation | Unverified | 0 |
| Video Object Segmentation via SAM 2: The 4th Solution for LSVOS Challenge VOS Track | Aug 19, 2024 | Object, Segmentation | Unverified | 0 |
| Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning | Aug 15, 2024 | Segmentation, Video Segmentation | Code Available | 2 |
| Novel adaptation of video segmentation to 3D MRI: efficient zero-shot knee segmentation with SAM2 | Aug 8, 2024 | Image Segmentation, Medical Image Analysis | Unverified | 0 |
| Is SAM 2 Better than SAM in Medical Image Segmentation? | Aug 8, 2024 | Image Segmentation, Medical Image Segmentation | Unverified | 0 |
| Saliency Detection in Educational Videos: Analyzing the Performance of Current Models, Identifying Limitations and Advancement Directions | Aug 8, 2024 | Information Retrieval, Saliency Detection | Unverified | 0 |
| SAM 2 in Robotic Surgery: An Empirical Evaluation for Robustness and Generalization in Surgical Video Segmentation | Aug 8, 2024 | Decoder, Interactive Segmentation | Unverified | 0 |
| Performance and Non-adversarial Robustness of the Segment Anything Model 2 in Surgical Video Segmentation | Aug 7, 2024 | Adversarial Robustness, Image Segmentation | Unverified | 0 |
| Segment Anything in Medical Images and Videos: Benchmark and Deployment | Aug 6, 2024 | Benchmarking, Segmentation | Code Available | 7 |
| Biomedical SAM 2: Segment Anything in Biomedical Images and Videos | Aug 6, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 0 |
| Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2 | Aug 3, 2024 | Diversity, Segmentation | Code Available | 3 |
| SAM 2: Segment Anything in Images and Videos | Aug 1, 2024 | Image Segmentation, Robot Manipulation Generalization | Code Available | 11 |
| ViLLa: Video Reasoning Segmentation with Large Language Model | Jul 18, 2024 | Image Segmentation, Language Modeling | Code Available | 1 |
| FoodMem: Near Real-time and Precise Food Video Segmentation | Jul 16, 2024 | Segmentation, Semantic Segmentation | Unverified | 0 |
| VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | Decoder, Object | Code Available | 3 |
| General and Task-Oriented Video Segmentation | Jul 9, 2024 | Disentanglement, Diversity | Code Available | 1 |
| Uni-DVPS: Unified Model for Depth-Aware Video Panoptic Segmentation | Jul 1, 2024 | Autonomous Driving, Decoder | Code Available | 1 |
| DaBiT: Depth and Blur informed Transformer for Joint Refocusing and Super-Resolution | Jul 1, 2024 | Deblurring, Super-Resolution | Code Available | 0 |
| Deep Unfolding-Aided Parameter Tuning for Plug-and-Play-Based Video Snapshot Compressive Imaging | Jun 28, 2024 | Denoising, Video Segmentation | Unverified | 0 |
| MissionGNN: Hierarchical Multimodal GNN-based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation | Jun 27, 2024 | Anomaly Detection, Graph Generation | Unverified | 0 |
| PVUW 2024 Challenge on Complex Video Understanding: Methods and Results | Jun 24, 2024 | Segmentation, Semantic Segmentation | Code Available | 4 |