Original article was published on Deep Learning on Medium
Building the Eat or No Eat AI for Managing Weight Loss
In a previous blog post I wrote about building an AI that could help users track their eating habits. In that concept I proposed using reinforcement learning agents and rewards to teach an AI which foods a user should or should not eat. The user would feed the agent pictures of food, perhaps captured with a phone camera, and the agent would respond with Eat or No Eat. Eat signals that the user could eat the food, while No Eat suggests they avoid it. In addition, as the user used the app they could train it to their specific needs and foods. The original blog post from 2018 is below:
The original blog post received a lot of attention but not enough interest for someone to build the app. While I believed the concept was sound, what I originally considered the biggest hurdle was managing huge amounts of data. Typically, reinforcement learning and deep learning models require thousands or millions of images or events to train effectively. However, after a couple of years of building deep learning apps and writing about them, I came up with a plan to tackle the problem. This plan came about while writing my new book for O’Reilly, Practical AI on the Google Cloud Platform. In the book I teach the reader to use various tools and techniques provided by the Google Cloud Platform to build practical AI. As I neared the end of writing the book I needed a final app that would combine multiple elements into one project. I decided to revisit the old blog post about the Eat/No Eat app, and it turned out to be a good idea.
Building the AI/App
To build the AI/App I needed to combine multiple strategies and techniques from deep learning, transfer learning, reinforcement learning and deep reinforcement learning. How to use and combine these techniques is covered in detail in the book. In this blog post I want to show how I built the app and demonstrate interesting sections of code. I have also provided all of the code for building the app in a GitHub repository:
The first hurdle in developing any AI is sourcing good data. My issue was finding a good image dataset of food labelled with nutritional information. As it turns out, there isn’t a good dataset that fits this requirement. However, there are a number of food image datasets annotated with recipe information, ingredients or category. Using a food image set with recipes sounded ideal, but in the end this just complicated matters. What ultimately worked was finding a food dataset with categories that could be generalized nutritionally. That way, as long as I had a deep learning network that could recognize a food’s category, it could then generalize the food’s nutrition.
There are a couple of good food datasets that generalize food to 101 categories. The more popular Food-101 has 101,000 images of food in 101 categories, with 1,000 images per category. This turned out to be too large for practical use, as I was developing all models on Google Colab. Instead, I ended up using the Recipes5K dataset developed by Marc Bolaños at the University of Barcelona. This dataset turned out to be more than adequate for the application.
Building the Nutritionist Model
After sourcing the data, the next big leap was building a food image model that could be fed images and respond with nutrition estimates. To do this I first looped through all the images in the source data and relabelled them with estimated nutrition by category. The ratings I used were the amount of protein, fat and carbohydrates, and overall health benefits. So a food like a donut should rate as 1 for protein, 9 for fat and carbs, and 2–3 for health benefits. These ratings were kept somewhat abstract in order to better generalize the food to the category. Breaking foods down to 3 simple nutrition values was also a way of reducing the number of dimensions for later learning.
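The relabelling step can be sketched as a lookup from food category to a small vector of ratings. This is only an illustration under stated assumptions: the folder-per-category layout, the names `NUTRITION_BY_CATEGORY`, `RATING_ORDER`, `category_of` and `relabel`, and every rating other than the donut's are hypothetical, and the donut's health rating of 2 is just one pick from the 2–3 range mentioned above. The encoding used in the repository may differ.

```python
from pathlib import PurePosixPath

# Hypothetical per-category ratings. Only the donut row echoes the text;
# the salmon row is invented purely for illustration.
NUTRITION_BY_CATEGORY = {
    "donuts": {"protein": 1, "fat": 9, "carbs": 9, "health": 2},
    "grilled_salmon": {"protein": 8, "fat": 5, "carbs": 1, "health": 9},
}
RATING_ORDER = ("protein", "fat", "carbs", "health")


def category_of(image_path):
    """Assume each image sits in a folder named after its food category."""
    return PurePosixPath(image_path).parent.name


def relabel(image_paths):
    """Return (path, ratings) pairs, skipping categories without ratings."""
    labelled = []
    for path in image_paths:
        ratings = NUTRITION_BY_CATEGORY.get(category_of(path))
        if ratings is None:
            continue  # no nutrition estimate for this category yet
        labelled.append((path, [float(ratings[name]) for name in RATING_ORDER]))
    return labelled
```

Calling `relabel(["images/donuts/001.jpg"])` would pair that image with the donut ratings, which then serve as regression targets in place of the original category label.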
Next, I wanted a robust model that could learn all the classes of foods well. For this I chose an existing convolutional model that combines Inception with ResNet, the InceptionResNetV2 model from Keras Applications. The model is initially loaded with weights pretrained on the ImageNet dataset. In typical transfer learning fashion, I stripped off the top classification layer and added a regression output layer to generate the food’s nutrition values. I set roughly half of InceptionResNetV2 as trainable, which left around 20 million trainable parameters. This model was then trained for 25 epochs, which took about 15 minutes with a GPU on Colab. The trained model was then tested on a second, arbitrary set of unseen images with good results.
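A minimal sketch of this kind of transfer learning setup in Keras might look like the following. It is not the code from the repository. The function name `build_nutritionist`, the default of 3 outputs, the average pooling, the Adam learning rate, and the choice to unfreeze the last half of the layers by count are all assumptions, so the number of trainable parameters will not necessarily match the figure quoted above.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import InceptionResNetV2


def build_nutritionist(num_outputs=3, input_shape=(299, 299, 3),
                       weights="imagenet", trainable_fraction=0.5):
    """Pretrained InceptionResNetV2 base with a linear regression head."""
    # include_top=False drops the ImageNet classification layer.
    base = InceptionResNetV2(include_top=False, weights=weights,
                             input_shape=input_shape, pooling="avg")

    # Assumption: freeze the earliest layers and fine-tune the rest.
    cutoff = int(len(base.layers) * (1.0 - trainable_fraction))
    for layer in base.layers[:cutoff]:
        layer.trainable = False
    for layer in base.layers[cutoff:]:
        layer.trainable = True

    # Linear activation because the outputs are ratings, not class scores.
    outputs = layers.Dense(num_outputs, activation="linear",
                           name="nutrition")(base.output)
    model = Model(inputs=base.input, outputs=outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss="mse")
    return model
```

With default arguments the function downloads the ImageNet weights on first use. Training would then be a call such as `model.fit(images, ratings, epochs=25)`, with images preprocessed by `tf.keras.applications.inception_resnet_v2.preprocess_input`.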
Building and Training the Agent
The next step was to build a deep reinforcement learning agent that could learn from the outputs of the Nutritionist model. I used a fine-tuned deep Q-network (DQN) agent that reads the outputs of the Nutritionist as state and responds with the action Eat or No Eat. For this version of the model the user has to review 25–100 food images and categorize them as Eat or No Eat. A special Colab notebook in the repository is set up as a tool for doing this. Those 25–100 images, with their Eat/No Eat responses, are fed into the DQN agent for training. After 5–10 training passes over the whole image set, the DQN agent is trained to decide whether to eat or not eat a particular food.
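To make the training loop concrete, here is a stripped-down Q-learning sketch in NumPy. It is not the fine-tuned DQN from the repository: there is no replay buffer or target network, and the class `TinyQAgent`, the `train` function, the +1/-1 reward and all hyperparameters are assumptions for illustration. Each labelled image is treated as a one-step episode, so the Q-learning target reduces to the immediate reward.

```python
import numpy as np

NO_EAT, EAT = 0, 1


class TinyQAgent:
    """One-hidden-layer Q-network over a small nutrition state vector."""

    def __init__(self, state_dim=3, hidden=16, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.5, (state_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.5, (hidden, 2))
        self.b2 = np.zeros(2)
        self.lr = lr

    def q_values(self, state):
        hidden = np.tanh(state @ self.w1 + self.b1)
        return hidden @ self.w2 + self.b2

    def act(self, state):
        """Greedy action: the one with the larger Q-value."""
        return int(np.argmax(self.q_values(state)))

    def update(self, state, action, target):
        """One gradient step on 0.5 * (Q(state, action) - target)^2."""
        hidden = np.tanh(state @ self.w1 + self.b1)
        q = hidden @ self.w2 + self.b2
        grad_q = np.zeros(2)
        grad_q[action] = q[action] - target
        # Backpropagate through the output layer before changing w2.
        grad_hidden = (self.w2 @ grad_q) * (1.0 - hidden ** 2)
        self.w2 -= self.lr * np.outer(hidden, grad_q)
        self.b2 -= self.lr * grad_q
        self.w1 -= self.lr * np.outer(state, grad_hidden)
        self.b1 -= self.lr * grad_hidden


def train(agent, states, labels, passes=10, epsilon=0.2, seed=0):
    """Epsilon-greedy passes over the user's Eat/No Eat labelled states."""
    rng = np.random.default_rng(seed)
    for _ in range(passes):
        for i in rng.permutation(len(states)):
            explore = rng.random() < epsilon
            action = int(rng.integers(2)) if explore else agent.act(states[i])
            # Assumed reward: +1 for agreeing with the user, -1 otherwise.
            reward = 1.0 if action == labels[i] else -1.0
            # One-step episode, so the target is just the reward.
            agent.update(states[i], action, reward)
```

Here `states` would be an array of Nutritionist outputs scaled to a small range, and `labels` the user's Eat/No Eat choices for the same images. After training, `agent.act(state)` returns `EAT` or `NO_EAT` for a new food.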
Overall, training the DQN agent in its current form takes only a few minutes for good results. The short training time for a DRL model can be attributed to the optimizations in the agent and the reduced state space. Since the Nutritionist model simplifies the representation of the food down to just 3 dimensions, the agent can learn quite quickly. Furthermore, as the user uses the agent it can be retrained on a regular basis with the user’s updated food requirements.
The current AI app or agent is intended to be used by a single user. This is intentional, in order to keep training requirements low. You could of course add other inputs to the state space fed into the DQN agent, such as the time of day, the user’s age and/or weight, and other factors. Adding these factors could even account for changes in a person’s diet over time. However, increasing the state space, that is, the amount of information fed into the DQN agent, will also increase training times. User attributes would also be needed if the agent were to serve multiple concurrent users. You could also go one step further and connect the Nutritionist model directly to the DQN agent and train them in tandem.
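As a rough illustration of extending the state space, the sketch below appends a few context features to the nutrition ratings. The function name `build_state`, the cyclical encoding of the hour, and the scaling constants 100 and 150 are assumptions chosen only to keep every feature in a similar range.

```python
import numpy as np


def build_state(nutrition, hour_of_day, age_years, weight_kg):
    """Concatenate nutrition ratings with scaled context features."""
    # Encode the hour on a circle so 23:00 and 00:00 end up close together.
    angle = 2.0 * np.pi * (hour_of_day % 24) / 24.0
    context = [
        np.sin(angle),
        np.cos(angle),
        age_years / 100.0,   # assumed scale, roughly 0..1
        weight_kg / 150.0,   # assumed scale, roughly 0..1
    ]
    return np.concatenate([np.asarray(nutrition, dtype=float), context])
```

A three-value nutrition vector becomes a seven-value state here, which is the kind of growth in state space that lengthens training.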
The code for building the Nutritionist model and DQN agent is up on the GitHub repo at https://github.com/cxbxmxcx/EatNoEat. There are several moving parts to the code and it is left in mostly raw form. This is intentional: the code is only meant as a sneak peek into the book’s code and into building the final AI. Seasoned TensorFlow developers with experience in transfer learning and deep reinforcement learning should be able to understand the various notebooks. I will likely present more details about this project in online training or Meetup sessions I am invited to present at. There is likely enough groundwork for keen developers to take the pieces and build a commercial app. The weight loss, nutrition and fitness industries make billions of dollars a year, so an AI app that helps people select the foods they eat could be quite popular.