SMITE: Segment Me In TimE

2024-10-24

Amirhossein Alimohammadi, Sauradip Nag, Saeid Asgari Taghanaki, Andrea Tagliasacchi, Ghassan Hamarneh, Ali Mahdavi Amiri

Code

github.com/alimohammadiamirhossein/smite
Official · PyTorch · ★ 212

Abstract

Segmenting an object in a video presents significant challenges. Each pixel must be accurately labelled, and these labels must remain consistent across frames. The difficulty increases when the segmentation has arbitrary granularity, meaning the number of segments can vary arbitrarily, and masks are defined based on only one or a few sample images. In this paper, we address this issue by employing a pre-trained text-to-image diffusion model supplemented with an additional tracking mechanism. We demonstrate that our approach can effectively manage various segmentation scenarios and outperforms state-of-the-art alternatives.

Tasks

Segmentation, Semantic Segmentation, Video Object Segmentation, Video Segmentation, Video Semantic Segmentation
