Learning Cross-Modal Embeddings for Cooking Recipes and Food Images

2017-07-01 · CVPR 2017

Amaia Salvador, Nicholas Hynes, Yusuf Aytar, Javier Marin, Ferda Ofli, Ingmar Weber, Antonio Torralba

Abstract

In this paper, we introduce Recipe1M, a new large-scale, structured corpus of over 1m cooking recipes and 800k food images. As the largest publicly available collection of recipe data, Recipe1M affords the ability to train high-capacity models on aligned, multi-modal data. Accordingly, we train a neural network to find a joint embedding of recipes and images that yields impressive results on an image-recipe retrieval task. Additionally, we demonstrate that regularization via the addition of a high-level, semantic classification objective improves performance to rival that of humans and enables semantic vector arithmetic. We postulate that these embeddings will provide a basis for further exploration of the Recipe1M dataset and food and cooking in general.
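To make the image-recipe retrieval task concrete, here is a minimal sketch of how retrieval in a joint embedding space could work once both modalities are mapped to vectors of the same size. It assumes cosine similarity as the ranking score, which the abstract does not specify. The function names `cosine` and `rank_recipes` and the toy vectors are hypothetical illustrations, not the authors' model or data.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_recipes(image_vec, recipe_vecs):
    """Return recipe indices sorted from most to least similar to the image."""
    scores = [cosine(image_vec, r) for r in recipe_vecs]
    return sorted(range(len(recipe_vecs)), key=lambda i: scores[i], reverse=True)

# Toy, hand-written embeddings (hypothetical values, not taken from Recipe1M).
# In a trained system these would come from an image encoder and a recipe encoder.
image_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
recipe_embeddings = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.0, 1.0]]

# For each image, rank all candidate recipes by similarity in the shared space.
rankings = [rank_recipes(img, recipe_embeddings) for img in image_embeddings]
```

In this toy setup each image is closest to the recipe at the same index, so a retrieval metric such as the rank of the true recipe would be 1 for both queries.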

Tasks

General Classification, Retrieval
