Learning Generative Image Object Manipulations from Language Instructions

2020-01-01 · ICLR 2020

Martin Längkvist, Andreas Persson, Amy Loutfi

Abstract

Adequate feature representations are essential for achieving high performance when computationally modeling high-level human cognitive tasks. Recent developments in deep convolutional and recurrent neural network architectures enable learning powerful feature representations from both images and natural language text. In addition, Relational Networks (RN) can learn relations between objects, and Generative Adversarial Networks (GAN) have been shown to generate realistic images. In this paper, we combine these four techniques to acquire a shared feature representation of the relation between objects in an input image and an encoded natural-language description of an object manipulation action. From this representation, the system generates an image showing the end effect the action would have on a computer-generated scene. The system is trained and evaluated on a simulated dataset and experimentally applied to real-world photos.
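
The abstract does not specify how the object relations and the language encoding are fused, so the following is only a minimal NumPy sketch of one common relation-network-style approach: every ordered pair of object features is concatenated with the instruction encoding, passed through a shared layer, and summed. The function name `relation_features`, the weight matrices `W_g` and `W_f`, and all dimensions are hypothetical and chosen for illustration; the random arrays stand in for CNN object features and an RNN instruction encoding, and nothing here should be read as the paper's actual architecture.

```python
import numpy as np

def relation_features(objects, instruction, W_g, W_f):
    """Relation-network-style fusion (illustrative, not the paper's model).

    objects:     (n, d_obj) array, one feature vector per object (e.g. from a CNN)
    instruction: (d_txt,) array, an encoding of the instruction (e.g. from an RNN)
    """
    n = objects.shape[0]
    # Build every ordered pair (o_i, o_j), each conditioned on the instruction.
    left = np.repeat(objects, n, axis=0)       # (n*n, d_obj)
    right = np.tile(objects, (n, 1))           # (n*n, d_obj)
    text = np.tile(instruction, (n * n, 1))    # (n*n, d_txt)
    pairs = np.concatenate([left, right, text], axis=1)
    # g: shared per-pair layer with ReLU; summing makes the result
    # independent of the order in which objects are listed.
    pair_codes = np.maximum(pairs @ W_g, 0.0)
    pooled = pair_codes.sum(axis=0)
    # f: maps the pooled relations to the shared representation.
    return np.tanh(pooled @ W_f)

# Hypothetical sizes and random stand-in data.
rng = np.random.default_rng(0)
n_objects, d_obj, d_txt, d_hidden, d_shared = 3, 8, 6, 16, 10
objects = rng.normal(size=(n_objects, d_obj))
instruction = rng.normal(size=d_txt)
W_g = rng.normal(size=(2 * d_obj + d_txt, d_hidden)) * 0.1
W_f = rng.normal(size=(d_hidden, d_shared)) * 0.1

shared = relation_features(objects, instruction, W_g, W_f)
print(shared.shape)
```

In a full system of the kind the abstract describes, a vector like `shared` would presumably condition a GAN generator that renders the post-action scene; that stage is omitted here.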

Tasks

Object
