Auto-Encoding for Shared Cross Domain Feature Representation and Image-to-Image Translation
Safalya Pal
Abstract
Image-to-image translation is a subset of computer vision and pattern recognition problems in which the goal is to learn a mapping between input images of domain X_1 and output images of domain X_2. Current methods use neural networks with an encoder-decoder structure to learn a mapping G: X_1 → X_2 such that the distributions of images from X_2 and G(X_1) are identical, where G(X_1) = d_G(f_G(X_1)), f_G(·) is referred to as the encoder, and d_G(·) is referred to as the decoder. Currently, methods that also compute an inverse mapping F: X_2 → X_1 use a separate encoder-decoder pair d_F(f_F(X_2)), or at least a separate decoder d_F(·), to do so. Here we introduce a method to perform cross-domain image-to-image translation across multiple domains using a single encoder-decoder architecture. We use an auto-encoder network which, given an input image X_1, first computes a latent domain encoding Z_d = f_d(X_1) and a latent content encoding Z_c = f_c(X_1), where the domain encoding Z_d and content encoding Z_c are independent. A decoder network g(Z_d, Z_c) then produces a reconstruction of the original image, X̂_1 = g(Z_d, Z_c) ≈ X_1. Ideally, the domain encoding Z_d contains no information regarding the content of the image, and the content encoding Z_c contains no information regarding the domain of the image. We use this property of the encodings to find the mapping across domains G: X_1 → X_2 by simply changing the domain encoding Z_d fed to the decoder: G(X_1) = g(f_d(x_2^i), f_c(X_1)), where x_2^i is the i-th observation of X_2.
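The data flow described above can be sketched in a few lines. This is a toy illustration, not the paper's implementation: the encoders f_d, f_c and the shared decoder g are stand-in random linear maps (all weight matrices and dimensions below are assumptions), chosen only to show how translation reduces to swapping the domain encoding while keeping the content encoding.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy sizes: flattened image, domain code, and content code.
D_IMG, D_DOM, D_CON = 64, 8, 32

# Stand-in weights; in the paper these would be trained neural networks.
W_d = rng.standard_normal((D_DOM, D_IMG))            # f_d, domain encoder
W_c = rng.standard_normal((D_CON, D_IMG))            # f_c, content encoder
W_g = rng.standard_normal((D_IMG, D_DOM + D_CON))    # g, shared decoder

def f_d(x):
    """Latent domain encoding Z_d = f_d(x)."""
    return W_d @ x

def f_c(x):
    """Latent content encoding Z_c = f_c(x)."""
    return W_c @ x

def g(z_d, z_c):
    """Single shared decoder: image from (Z_d, Z_c)."""
    return W_g @ np.concatenate([z_d, z_c])

def translate(x1, x2_i):
    """G(X_1) = g(f_d(x_2^i), f_c(X_1)): keep x1's content, take x2_i's domain."""
    return g(f_d(x2_i), f_c(x1))

x1 = rng.standard_normal(D_IMG)    # image from domain X_1
x2_i = rng.standard_normal(D_IMG)  # i-th observation of domain X_2

recon = g(f_d(x1), f_c(x1))        # auto-encoding reconstruction of x1
translated = translate(x1, x2_i)   # cross-domain translation X_1 -> X_2
```

Note that both reconstruction and translation go through the same decoder g; no second encoder-decoder pair is needed for the reverse direction, since translating back is just `translate(x2_i, x1)` with the roles of the arguments swapped.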