Facebook Inverse Cooking Algorithm

Original article was published on Deep Learning on Medium

Predicting a full recipe from an image better than humans

Figure 1 Predicted ingredients after running the Inverse Cooking algorithm on a meal of sushi.

Developed by Facebook AI Research, this recipe generation algorithm predicts the ingredients, cooking instructions and title of a recipe directly from an image (Figure 2) [1].

Figure 2 Example of a recipe generated by the Inverse Cooking algorithm

Earlier approaches treated the problem as recipe retrieval: given an image, they return the recipe that is most similar to it in a learned embedding space. The results depend heavily on the quality of the learned embedding and on the size and variability of the dataset, so these approaches fail when no recipe in the static dataset matches the input image [1].
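To make the retrieval idea concrete, here is a minimal sketch of nearest-neighbour lookup by cosine similarity. It is not the retrieval system from [2]: the function names, the recipe titles and the 3-dimensional vectors are invented for illustration, and real embeddings would come from a trained network.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_recipe(image_embedding, recipe_embeddings):
    """Return (title, score) of the stored recipe most similar to the image."""
    scored = (
        (title, cosine_similarity(image_embedding, embedding))
        for title, embedding in recipe_embeddings.items()
    )
    return max(scored, key=lambda pair: pair[1])

# Toy embeddings, invented for illustration only (hypothetical values).
recipe_embeddings = {
    "sushi": [0.9, 0.1, 0.0],
    "pizza": [0.1, 0.9, 0.2],
    "salad": [0.0, 0.2, 0.9],
}

# A query that does not resemble any stored dish particularly well.
query = [0.5, 0.5, 0.5]
best_title, best_score = retrieve_recipe(query, recipe_embeddings)
print(best_title, round(best_score, 3))
```

Even for a query that matches nothing well, the lookup still returns one of the stored recipes. This is the limitation described above: a retrieval system can only answer with something already in its dataset.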

Instead of retrieving a recipe directly from an image, the Inverse Cooking algorithm proposes a pipeline with an intermediate step in which the set of ingredients is predicted first. The instructions are then generated taking into account both the image and the predicted ingredients (Figure 1) [1].

Figure 3 Inverse Cooking recipe generation model with the multiple encoders and decoders, generating the cooking instructions [1]
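The data flow of this two-stage pipeline can be sketched in a few lines. This is only an illustration of the order of the steps, not the authors' implementation: `inverse_cook`, `toy_predictor` and `toy_generator` are hypothetical names, and the two toy functions are hard-coded stand-ins for what are learned neural models in the real system [1].

```python
from typing import Callable, Dict, List, Set

# Hypothetical type aliases for the two stages of the pipeline.
IngredientPredictor = Callable[[bytes], Set[str]]
InstructionGenerator = Callable[[bytes, Set[str]], List[str]]

def inverse_cook(
    image: bytes,
    predict_ingredients: IngredientPredictor,
    generate_instructions: InstructionGenerator,
) -> Dict[str, List[str]]:
    """Run the two stages in order: ingredients first, then instructions."""
    ingredients = predict_ingredients(image)
    # The second stage sees both the image and the predicted ingredients.
    instructions = generate_instructions(image, ingredients)
    return {"ingredients": sorted(ingredients), "instructions": instructions}

def toy_predictor(image: bytes) -> Set[str]:
    """Stand-in for the ingredient model: ignores the image entirely."""
    return {"rice", "salmon", "nori"}

def toy_generator(image: bytes, ingredients: Set[str]) -> List[str]:
    """Stand-in for the instruction model: one step per ingredient."""
    steps = [f"Prepare the {name}." for name in sorted(ingredients)]
    return steps + ["Combine and serve."]

recipe = inverse_cook(b"fake-image-bytes", toy_predictor, toy_generator)
for step in recipe["instructions"]:
    print(step)
```

Passing the two stages in as functions keeps the sketch independent of any particular model. The point is only that the instruction stage is conditioned on the output of the ingredient stage.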

One of the major achievements of this method is that, when predicting ingredients from an image, it is more accurate than both a baseline recipe retrieval system [2] and the average human [1].

Figure 4 Left: IoU and F1 scores for ingredients obtained with retrieval approach [2], Facebook’s method (Ours) and humans. Right: Recipe success rate according to human judgment [1]
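For readers unfamiliar with the two metrics in Figure 4, the sketch below computes IoU (intersection over union) and F1 for a single pair of ingredient sets. The ingredient lists are invented for illustration, and how the scores are aggregated over a whole test set in [1] is not covered here.

```python
def ingredient_iou(predicted, reference):
    """Intersection over union of two ingredient sets (0.0 if both are empty)."""
    predicted, reference = set(predicted), set(reference)
    union = predicted | reference
    if not union:
        return 0.0
    return len(predicted & reference) / len(union)

def ingredient_f1(predicted, reference):
    """Harmonic mean of precision and recall (0.0 if there is no overlap)."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)

# Invented example: 3 of 4 predictions are correct, 2 true ingredients are missed.
predicted = {"rice", "salmon", "nori", "avocado"}
reference = {"rice", "salmon", "nori", "cucumber", "soy sauce"}

iou = ingredient_iou(predicted, reference)  # 3 shared / 6 distinct
f1 = ingredient_f1(predicted, reference)    # precision 3/4, recall 3/5
print(f"IoU = {iou:.3f}, F1 = {f1:.3f}")
```

In this example three ingredients are shared out of six distinct ones, so the IoU is 0.5, while the F1 score is about 0.667. Both metrics penalise missed ingredients as well as wrongly predicted ones.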

The Inverse Cooking algorithm was included in a food recommendation web application developed and published here. Based on the predicted ingredients, the application offers the user several suggestions, such as different ingredient combinations (Figure 1).


[1] A. Salvador, M. Drozdzal, X. Giro-i-Nieto and A. Romero, “Inverse Cooking: Recipe Generation from Food Images,” Computer Vision and Pattern Recognition, 2018.

[2] A. Salvador, N. Hynes, Y. Aytar, J. Marin, F. Ofli, I. Weber and A. Torralba, “Learning cross-modal embeddings for cooking recipes and food images,” Computer Vision and Pattern Recognition, 2017.