TL;DR Google Colab offers free GPUs that make it easy for anyone to build lightweight, performant AI models.
Recently, I’ve had quite a few friends, family members, and random people on LinkedIn asking me about how they can get started in deep learning…
This tutorial was designed for all those aspiring AI enthusiasts who want to get their feet wet in the world of deep learning.
Without any fluff, this tutorial has the following goals:
- Provide an introduction to convolutional neural networks (CNNs)
- Explain the importance of using GPUs in deep learning
- Implement a pre-trained image recognition model on any picture FOR FREE!
- Include additional resources that will help you grow your skills as an AI enthusiast
There are two roadblocks I see with aspiring AI engineers and researchers.
AI is often taught from the bottom up
Whether in school, through online tutorials, or MOOCs, the information is presented in a bottom-up fashion. The belief is that by focusing on the granular, theoretical, and mathematical details, you will be able to build a foundation and truly understand a skill. I strongly disagree. If I want to learn to play basketball, I don’t need to know physics. Give me a basketball and let me hit the court.
My goal for this brief tutorial is to hand you the basketball, help you make a layup, and open your eyes to the game. But before I dip into a story about my glory days playing basketball, let’s look at the second roadblock.
Cost can be a barrier for aspiring learners
Whether the money goes toward a computer science degree, an online certification, or just keeping an AWS GPU instance running, it starts to add up. I’m a big believer in eliminating barriers to entry for emerging technology education. That is why, when I find a way to do it for free, I share it with as many people as I can. Knowledge is power and equal access is important. Plus, I’m a little on the cheap side.
As I began creating this tutorial, I started reading more about the mission of Google Colab. Their goal is the dissemination of machine learning education and research. They are doing some amazing things and serve as a huge motivation. So with this tutorial, we will be using primarily Google tools (Colab and TensorFlow, alongside Python) as a big, fat thank you.
Quick AI Background
Keeping everything at a high level, convolutional neural networks (CNNs) are a great tool for image recognition/classification tasks. CNNs mimic the human visual system by learning to first recognize components of an image (lines, curves, edges) then learning to combine these components to recognize larger structures (faces, objects, patterns).
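To make the "recognize edges first" idea concrete, here is a minimal sketch of the sliding-filter operation a convolutional layer performs. The tiny image and the hand-written filter values are made up for illustration; a real CNN such as Inception-v3 learns its filter values during training and stacks many such layers.

```python
# A tiny 5x5 grayscale "image": dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]

# A hand-written 3x3 filter that responds to vertical edges.
# (Illustrative values only; a trained network learns its own filters.)
vertical_edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]


def convolve2d(img, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it is a cross-correlation.
    """
    k = len(kernel)
    out_rows = len(img) - k + 1
    out_cols = len(img[0]) - k + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += img[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


feature_map = convolve2d(image, vertical_edge_kernel)
for row in feature_map:
    print(row)  # each row prints [3, 3, 0]
```

The large values sit where the filter overlaps the dark-to-bright boundary, and the zero sits where the patch is uniformly bright. Deeper layers combine many such feature maps to pick out larger structures.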
Today, we will be using a model from TensorFlow called Inception-v3, which is trained for the ImageNet Large Scale Visual Recognition Challenge. ImageNet is an academic benchmark for computer vision, and researchers tend to validate their work against it. Inception-v3 was trained on the 2012 data and can classify an image into one of 1,000 categories.
Everything from pictures of zebras to basketballs to soy sauce can be categorized with a high degree of accuracy. If you want to dive deeper into the categories, you can see the full library here.
Since we are keeping this tutorial light, we won’t get into the details of unfreezing and training different layers of our model. That lets you hit the ground running.
Also, TensorFlow is great because it gives engineers and researchers granular control over each neuron. As your skill-set expands, don’t hesitate to dive into the plethora of resources available. It is a very powerful framework.
This tutorial will teach you how to implement Inception-v3 in Google Colab so that you can classify images into 1,000 different categories using just a few lines of code.
We are using Google Colab for its free graphics processing units (GPUs). GPUs, though originally developed for video game graphics, are extremely fast at matrix multiplication. Machine learning involves a lot of matrix math, so GPUs cut the computation time that models require. Parallelization is one of a GPU’s greatest qualities.
Google Colab is a great tool because it allows work to be easily shared and reviewed, and it’s straightforward to create a new notebook and start coding.
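Why does matrix multiplication parallelize so well? A rough sketch in plain Python: every cell of the result depends on just one row and one column of the inputs, so the cells can be computed in any order, or all at once on separate GPU cores. The function names here are hypothetical, and the code runs on the CPU; it only illustrates the independence that a GPU exploits.

```python
import random


def dot(row, col):
    """Multiply-and-add one row against one column."""
    return sum(a * b for a, b in zip(row, col))


def matmul_in_any_order(a, b, seed=0):
    """Fill in the cells of the product a @ b in a shuffled order."""
    n_rows, n_cols = len(a), len(b[0])
    columns = list(zip(*b))  # the columns of b
    cells = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    random.Random(seed).shuffle(cells)  # the order does not matter
    result = [[0] * n_cols for _ in range(n_rows)]
    for i, j in cells:
        result[i][j] = dot(a[i], columns[j])
    return result


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul_in_any_order(a, b))  # [[19, 22], [43, 50]]
```

Because no cell waits on another, hardware with thousands of small cores can work on many cells at the same moment.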
Step 1: Navigate to Google Colab and sign in
Step 2: Create a new Python 3 notebook
Step 3: Add a GPU hardware accelerator from the notebook settings menu
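If you want to confirm the accelerator is actually attached, one rough check is to look for NVIDIA's nvidia-smi command-line tool. This sketch assumes the GPU runtime ships that tool (an assumption about the Colab environment, not something covered in this tutorial), and gpu_visible is a hypothetical helper name.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if the nvidia-smi tool is installed and exits successfully."""
    if shutil.which("nvidia-smi") is None:
        return False  # tool not on PATH, so most likely no NVIDIA GPU runtime
    try:
        result = subprocess.run(
            ["nvidia-smi"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("GPU visible:", gpu_visible())
```

If this prints False after you changed the notebook settings, it may be worth reopening the settings menu and checking that the change was saved.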
Step 4: Import TensorFlow
import tensorflow as tf
Step 5: Download TensorFlow Pre-trained Models
!git clone https://github.com/tensorflow/models.git
Step 6: Save an Image for the Model
In this case, we are using a simple picture of a basketball. We will feed this image into our ImageNet pre-trained image recognition model to categorize the image.
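Before running the model, it can help to confirm the picture really landed in the notebook's working directory. The sketch below is one possible sanity check: it assumes the image is saved as Basketball-large.png (the file name used in the next step), and looks_like_png is a hypothetical helper that only inspects the standard 8-byte PNG signature.

```python
import os

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of every PNG file


def looks_like_png(path):
    """Return True if path exists and starts with the PNG signature."""
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as handle:
        return handle.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


# Prints False if the file is missing or is not a PNG.
print(looks_like_png("Basketball-large.png"))
```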
Step 7: Run Model on Basketball Image
!python models/tutorials/image/imagenet/classify_image.py --image_file Basketball-large.png
The classify_image.py script downloads the trained model from tensorflow.org the first time it runs. You’ll need about 200 MB of free disk space. If you receive an error, you will need to clear space on Google Colab. I’ve included a short write-up on how to handle common errors on the shared version of the code here. If there is no error, you can keep pushing ahead.
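One way to check the free space up front is Python's standard shutil.disk_usage. This is a minimal sketch: the 200 MB threshold comes from the note above, while has_room and REQUIRED_BYTES are hypothetical names chosen for illustration.

```python
import shutil

REQUIRED_BYTES = 200 * 1024 * 1024  # roughly the 200 MB mentioned above


def has_room(path=".", required_bytes=REQUIRED_BYTES):
    """Return True if the disk holding path has at least required_bytes free."""
    return shutil.disk_usage(path).free >= required_bytes


usage = shutil.disk_usage(".")
print("Free space: %.1f MB" % (usage.free / (1024 * 1024)))
print("Enough room for the model:", has_room())
```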
Step 8: Analyzing the Results
basketball (score = 0.99916)
orange (score = 0.00011)
lifeboat (score = 0.00006)
monarch, monarch butterfly, milkweed butterfly, Danaus plexippus (score = 0.00006)
space heater (score = 0.00003)
With our pre-trained model, we receive the five most likely classifications (by default). From the above output, we can see that the model has a 99.9% certainty that the image is a basketball.
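If you would rather work with these results in code than read them by eye, the printed lines can be parsed into (label, score) pairs. The sketch below copies the five output lines verbatim and assumes the "label (score = x)" layout shown; parse_predictions is a hypothetical helper, not part of the TensorFlow script.

```python
import re

# The five lines printed in Step 8, copied verbatim.
raw_output = """\
basketball (score = 0.99916)
orange (score = 0.00011)
lifeboat (score = 0.00006)
monarch, monarch butterfly, milkweed butterfly, Danaus plexippus (score = 0.00006)
space heater (score = 0.00003)
"""

LINE_PATTERN = re.compile(r"^(?P<label>.+) \(score = (?P<score>[0-9.]+)\)$")


def parse_predictions(text):
    """Turn 'label (score = x)' lines into a list of (label, score) pairs."""
    predictions = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:
            predictions.append((match.group("label"), float(match.group("score"))))
    return predictions


predictions = parse_predictions(raw_output)
top_label, top_score = max(predictions, key=lambda pair: pair[1])
print("%s: %.1f%%" % (top_label, top_score * 100))  # basketball: 99.9%
```

Note that a label can itself contain commas (the monarch butterfly entry lists several synonyms), which is why the pattern anchors on the trailing "(score = ...)" rather than splitting on punctuation.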
Performance: CPU vs. GPU
At this point, we’ve successfully run our model, but I was still a little curious about how much faster the GPU hardware accelerator was than the normal CPU. Below is the code for a minimalist benchmarking tool (in Python). I used this snippet to compare the performance of the CPU and GPU notebooks.
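import time

# Record wall-clock time before and after one classification run.
start = time.time()
print("The Clock Has Started")
!python models/tutorials/image/imagenet/classify_image.py --image_file Basketball-large.png
end = time.time()
print("Time is UP!")
# Elapsed time in seconds.
print(end - start)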
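A single timing can be noisy, so a slightly sturdier variation on the snippet above is to repeat the run a few times and look at the spread. This sketch uses subprocess instead of the notebook's "!" shortcut so that it runs as ordinary Python. time_command is a hypothetical helper, and the command shown is a stand-in that does nothing; in the notebook you would pass the classify_image.py command from Step 7 instead.

```python
import subprocess
import sys
import time


def time_command(command, repeats=3):
    """Run command several times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in command so the sketch runs anywhere: start Python and exit at once.
durations = time_command([sys.executable, "-c", "pass"], repeats=3)
print("fastest: %.3f s, slowest: %.3f s" % (min(durations), max(durations)))
```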
The results are in! The CPU notebook took 4.92 seconds while the GPU-accelerated notebook took 4.79 seconds. We can see the performance improved slightly with the GPU, and this performance gap would widen if we were doing more intensive operations (e.g., training hidden layers). But overall, not too shabby considering we are getting all this for free!
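To put those two numbers in perspective, here is the arithmetic as a short sketch, using only the timings reported above.

```python
cpu_seconds = 4.92  # CPU notebook, from the run above
gpu_seconds = 4.79  # GPU notebook, from the run above

speedup = cpu_seconds / gpu_seconds
percent_saved = (cpu_seconds - gpu_seconds) / cpu_seconds * 100

print("speedup: %.3fx" % speedup)            # speedup: 1.027x
print("time saved: %.1f%%" % percent_saved)  # time saved: 2.6%
```

A gap this small on a single run could also be noise. One plausible reading, which I have not verified, is that classifying a single image spends most of its time starting Python and loading the model rather than doing matrix math.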
The biggest limitation I noticed while using Google Colab was the recurring memory errors. I’m wondering if it’s possible to adjust the box size so these errors go away for good. If anyone has any insight on this, I would love to learn more.
This tutorial was meant to serve as a lightweight introduction to deep learning by giving you all the tools needed to implement an image recognition model for free!
If you are eager to learn more, I recommend checking out the lectures over at fast.ai. They have a top-down teaching style that provides a great foundation in deep learning. After completing the deep learning course, you will be able to achieve world-class results on machine learning models in Kaggle competitions.
Source: Deep Learning on Medium