Last year, Google and friends released a very fun demo that runs in the browser called Teachable Machine, where anyone can train a neural network (no code required) to classify images using a camera. The Google Creative Lab published a simplified version of the code that makes Teachable Machine possible: https://github.com/googlecreativelab/teachable-machine-boilerplate.
I’ve used this boilerplate code to build a demo where you can play different games using just a camera!
Source code: here.
About the Neural Network
tl;dr There’s a nice, short explanation of the model in the original repository
For this demo to work properly we need a model that satisfies the constraints below:
- The model should be able to classify many different kinds of images. The input comes from the user’s camera, which means we don’t know the classes beforehand.
- Since we don’t know the classes, we need to train directly in the browser.
- It would be nice if basically anyone with a computer could run this demo; in other words, it should not need a lot of computing power.
- It needs to be fast. It’s not fun to play a game that takes 5 minutes to figure out whether we want to move left or right.
For constraints 1 and 3, a very good approach is Transfer Learning: we use a model that was trained on thousands of real-world classes (dogs, trucks, …) and has already learned a lot about shapes and edges. For this demo the Google Creative Lab used SqueezeNet, a convolutional neural network with AlexNet-level accuracy but a model size under 0.5 MB.
But how do we train in the browser on data we know nothing about? We can “plug” a KNN into one of the layers of SqueezeNet. Why KNN? Because it’s an instance-based model: instead of performing explicit generalization, it compares new problem instances with instances seen in training, so there is no explicit training phase.
“k-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. The k-NN algorithm is among the simplest of all machine learning algorithms.”
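The “lazy learning” idea above can be sketched in a few lines of plain JavaScript. This is a hypothetical helper for illustration, not the boilerplate’s actual code: each training example is just a stored feature vector with a label, and prediction is a majority vote among the k nearest vectors.

```javascript
// Minimal k-NN sketch: "training" is only storing labeled feature vectors,
// and all real computation happens at classification time.
function knnPredict(examples, query, k = 3) {
  const neighbors = examples
    .map(({ features, label }) => ({
      label,
      // Squared Euclidean distance is enough for ranking neighbors.
      dist: features.reduce((sum, x, i) => sum + (x - query[i]) ** 2, 0),
    }))
    .sort((a, b) => a.dist - b.dist)
    .slice(0, k);

  // Majority vote among the k nearest neighbors.
  const votes = {};
  for (const { label } of neighbors) votes[label] = (votes[label] || 0) + 1;
  return Object.keys(votes).reduce((a, b) => (votes[a] >= votes[b] ? a : b));
}

// No training phase: we just accumulate examples as they arrive.
const examples = [
  { features: [0, 0], label: 'left' },
  { features: [0, 1], label: 'left' },
  { features: [5, 5], label: 'right' },
  { features: [5, 6], label: 'right' },
];

console.log(knnPredict(examples, [0.5, 0.5])); // -> 'left'
console.log(knnPredict(examples, [5, 5.5]));   // -> 'right'
```

In the real demo the feature vectors are not raw pixels but SqueezeNet activations, which is what makes such a simple classifier work on camera images.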
All of this can be done easily using DeepLearnJS and is well described here.
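To make the “plug a KNN into a layer” idea concrete, here is a self-contained sketch of the pipeline. The `embed` function below is a toy stand-in for the SqueezeNet activations that DeepLearnJS computes from camera frames; the names (`addExample`, `predict`) are assumptions for illustration, not the boilerplate’s real API.

```javascript
// Labeled embeddings, accumulated while the user holds a "train" button.
const stored = [];

// Stub embedding: mean pixel value. In the real demo this would be the
// activations of a SqueezeNet layer for the camera frame.
function embed(image) {
  return [image.reduce((sum, p) => sum + p, 0) / image.length];
}

// Training a class: embed the frame and store it under the class label.
function addExample(image, label) {
  stored.push({ features: embed(image), label });
}

// Classification: return the label of the nearest stored embedding (k = 1
// for brevity; the demo uses a larger k with voting).
function predict(image) {
  const q = embed(image);
  let best = null;
  for (const e of stored) {
    const d = e.features.reduce((sum, x, i) => sum + (x - q[i]) ** 2, 0);
    if (best === null || d < best.dist) best = { label: e.label, dist: d };
  }
  return best && best.label;
}

addExample([0, 0, 10, 10], 'left');       // a dark frame
addExample([200, 220, 240, 250], 'right'); // a bright frame
console.log(predict([190, 210, 230, 240])); // -> 'right'
```

The frozen network does the hard work of turning images into meaningful vectors, so the classifier on top can stay this simple and still run in real time in the browser.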
First, a few tips about training:
- Use clearly distinguishable images; small details will not be enough to tell the classes apart.
- Try to use the same background for all the classes; if the background changes between classes, the KNN may learn to detect the background instead of the object.
- Center the object, and make sure to train with the object in different positions. About 20 to 50 examples for each class should be enough.
After you have trained all the buttons, make sure the model can distinguish them so you have a good game experience:
I actually made a bad choice with the “up symbol”: sometimes, when moving from the “left symbol” to the “right symbol” (or the opposite), the model interprets the in-between frames as the “up symbol”. Be careful with that.
Play the game!
For this version I used three game implementations I found on GitHub:
Sometimes things can go wrong…
We can train anything we want to represent a button (class); it doesn’t need to make sense!
Source: Deep Learning on Medium