SynthRef: Generation of Synthetic Referring Expressions for Object Segmentation
Ioannis Kazakos, Carles Ventura, Miriam Bellver, Carina Silberer, Xavier Giro-i-Nieto
Code
- github.com/imatge-upc/synthref (Official, PyTorch, ★ 2)
- github.com/miriambellver/refvos (PyTorch, ★ 28)
Abstract
Recent advances in deep learning have brought significant progress in visual grounding tasks such as language-guided video object segmentation. However, the annotation time needed to collect large datasets for these tasks is a significant bottleneck. To this end, we propose SynthRef, a novel method for generating synthetic referring expressions for target objects in an image (or video frame), and we present and disseminate the first large-scale dataset with synthetic referring expressions for video object segmentation. Our experiments demonstrate that training with our synthetic referring expressions improves a model's ability to generalize across different datasets, without any additional annotation cost. Moreover, our formulation can be applied to any object detection or segmentation dataset.
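The abstract describes composing referring expressions directly from annotations that detection and segmentation datasets already provide. Below is a minimal, hypothetical sketch of that idea, assuming expressions are built from the object's class label plus a coarse spatial attribute when the class alone is ambiguous; all function and field names are illustrative and not taken from the official repository.

```python
# Hypothetical sketch of SynthRef-style synthetic referring expression
# generation from existing annotations. Names and the exact attribute set
# are assumptions for illustration only.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Annotation:
    category: str                              # class label, e.g. "person"
    bbox: Tuple[float, float, float, float]    # (x, y, w, h) in pixels


def location_phrase(bbox: Tuple[float, float, float, float], img_w: int) -> str:
    """Coarse horizontal position of the box centre."""
    cx = bbox[0] + bbox[2] / 2.0
    if cx < img_w / 3:
        return "on the left"
    if cx > 2 * img_w / 3:
        return "on the right"
    return "in the middle"


def synth_referring_expression(target: Annotation,
                               others: List[Annotation],
                               img_w: int) -> str:
    """Build a short expression that singles out `target` among `others`."""
    same_class = [a for a in others if a.category == target.category]
    if not same_class:
        # The category alone is discriminative: "the dog".
        return f"the {target.category}"
    # Otherwise disambiguate with a spatial attribute: "the person on the left".
    return f"the {target.category} {location_phrase(target.bbox, img_w)}"


# Example: two people in a 640x480 frame.
anns = [Annotation("person", (50, 100, 120, 300)),
        Annotation("person", (450, 90, 130, 310))]
print(synth_referring_expression(anns[0], anns[1:], 640))
# -> "the person on the left"
```

Because the generator only consumes class labels and boxes, the same procedure extends to any annotated detection or segmentation dataset, which is the property the abstract highlights.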
Tasks
- Referring Expression Segmentation
- Video Object Segmentation
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| DAVIS 2017 (val) | RefVOS + SynthRef-YouTube-VIS | J&F (1st frame) | 45.3 | — | Unverified |
| Refer-YouTube-VOS | RefVOS (human referring expressions) | Mean IoU | 39.5 | — | Unverified |
| Refer-YouTube-VOS | RefVOS (synthetic referring expressions) | Mean IoU | 35.0 | — | Unverified |
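For reference, the Mean IoU reported above is the average intersection-over-union between predicted and ground-truth segmentation masks. A small sketch of that computation follows, assuming binary masks as NumPy arrays; the exact evaluation protocol of each benchmark (e.g. per-frame vs. per-object averaging) may differ.

```python
import numpy as np


def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-union of two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / union if union > 0 else 1.0


def mean_iou(preds, gts) -> float:
    """Average IoU over paired prediction / ground-truth masks."""
    return float(np.mean([iou(p, g) for p, g in zip(preds, gts)]))


# Toy example: a 2x2 prediction overlapping half of a 2x4 ground truth.
pred = np.zeros((4, 4), dtype=bool); pred[:2, :2] = True
gt = np.zeros((4, 4), dtype=bool); gt[:2, :] = True
print(mean_iou([pred], [gt]))  # intersection 4, union 8 -> 0.5
```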