SOTAVerified

MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation

2024-01-01CVPR 2024Code Available2· sign in to hype

Xiaolong Deng, Huisi Wu, Runhao Zeng, Jing Qin

Code Available — Be the first to reproduce this paper.

Reproduce

Code

Abstract

We propose a novel echocardiographical video segmentation model by adapting SAM to medical videos to address some long-standing challenges in ultrasound video segmentation including (1) massive speckle noise and artifacts (2) extremely ambiguous boundaries and (3) large variations of targeting objects across frames. The core technique of our model is a temporal-aware and noise-resilient prompting scheme. Specifically we employ a space-time memory that contains both spatial and temporal information to prompt the segmentation of current frame and thus we call the proposed model as MemSAM. In prompting the memory carrying temporal cues sequentially prompt the video segmentation frame by frame. Meanwhile as the memory prompt propagates high-level features it avoids the issue of misidentification caused by mask propagation and improves representation consistency. To address the challenge of speckle noise we further propose a memory reinforcement mechanism which leverages predicted masks to improve the quality of the memory before storing it. We extensively evaluate our method on two public datasets and demonstrate state-of-the-art performance compared to existing models. Particularly our model achieves comparable performance with fully supervised approaches with limited annotations. Codes are available at https://github.com/dengxl0520/MemSAM.

Tasks

Reproductions