Google Colab

Original article was published by Alejandro Colocho on Artificial Intelligence on Medium


Working With Google Colab

Photo by: Rafael Pol

There was a running joke a few years back that airplane mode would turn your electronic device into an airplane, and it would fly away. Very funny! However, few people know that most computers can turn into jet engines. Unfortunately, they won’t actually fly. How can computers do this, you may ask? Very easily: all you need to do is train a convolutional neural network or run a grid search using scikit-learn. If you have not gotten the joke yet, I am trying to say that these processes require a lot of resources and will make your computer work very hard.

The problem

I will go out on a limb and say that most people do not have computers that can run heavy GPU and CPU tasks efficiently. For example, my computer has a 3.5GHz dual-core i7 processor and an integrated graphics card, an Iris Plus Graphics 650. Both perform amazingly when I am doing regular things. I would even say they are overkill for regular things. However, when I bring out the neural networks or perform a grid search, my computer works in the 90% capacity neighborhood. What’s the big deal? Computers are meant to work hard, right? Well, yes and no. I am not worried that my computer works in the 90% capacity neighborhood, but I am worried that it takes hours! For example, if you are familiar with my previous project using Keras to classify the body of cars, then you might (or might not) be surprised that my computer estimated it would take around 30 hours to train the model. Imagine that! 30 hours just to train the model, and I wouldn’t even know whether everything was working as it should until the end. There has to be an easier way without breaking the bank, right?

The solution

answer_to_previous_question = "Yes"  # there is an easier way

if answer_to_previous_question == "Yes":
    print('You are right')
else:
    print('You are wrong')

Like the code says, if you answered yes to the previous question, then you are right. The answer is simple: Google Colab. Keep in mind, however, that it is not the only answer. But what is Google Colab? I am glad you asked. Google Colab is a free service for data scientists. It gives you access to CPUs, GPUs, and TPUs to train all sorts of models. The cool thing is that the hardware you get access to is extremely fast, but it is not unlimited. I have developed a few strategies for working with Google Colab, so you can get all the benefits without using up all the resources and being throttled.

Google Colab

First, let’s talk about how Google Colab works. You can click here to get to Google Colab. To get started, all you have to do is log in to or create a Google account. A window should pop up showing all your notebooks. You will have none yet, so click on the New Notebook button.

If you are familiar with Jupyter Notebook, then you will notice that Google Colab looks similar. It even has most of the keyboard shortcuts that Jupyter Notebook has.

Now, let’s discuss some strategies. I found that it works best to do most of your work in your local environment and use Google Colab only to run the heavy stuff. The reason is that Google Colab will terminate your runtime if it’s left unattended for a while. If this happens, you will have to rerun your whole notebook. To keep things quick, I would suggest exporting only the data you need over to Google Colab when you need it. However, if you plan to upload it directly to Google Colab, then you are in for a surprise: it is extremely slow. I would recommend making a folder in your Google Drive and uploading all your data to that folder. Isn’t Google Drive a different app? You are correct, it is! However, like everything Google, it is connected. In order to use your data from Google Colab, you will need to mount your Google Drive. To do this, click on the folder icon in the left sidebar, right under the Google Colab logo in the top left corner of your window. Make sure that you are connected to a runtime, or else the panel will be blank. Once the left panel opens, you will see a folder icon with the Google Drive logo on it. Click that to mount your Drive to your Colab notebook.
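If you prefer code over clicking, the mount step can also be done from a notebook cell. The snippet below is a minimal sketch rather than the only way to do it: it uses drive.mount from the google.colab package, which only exists inside a Colab runtime, so it skips the mount when run anywhere else. The helper name mount_drive and the folder name colab_data are made up for illustration; use whatever folder you created in your Drive.

```python
from pathlib import Path


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive inside Colab; return the mount path, or None elsewhere."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        print("Not running in Colab; skipping the Drive mount.")
        return None
    drive.mount(mount_point)  # asks you to authorize access the first time
    return Path(mount_point)


drive_root = mount_drive()
if drive_root is not None:
    # "colab_data" is a hypothetical folder name; "MyDrive" is where your files appear.
    data_dir = drive_root / "MyDrive" / "colab_data"
    if data_dir.exists():
        print(sorted(p.name for p in data_dir.iterdir()))
    else:
        print(f"{data_dir} does not exist yet")
```

After the mount succeeds, files in that folder can be read like any other local path.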

Great! You have your data, so you are ready to go, right? Not quite. Think through your problem and the technology you need to use. If you do not need a GPU or TPU, then do not select one. This is to be respectful of the resources Google is donating, so we can all continue to research and create great things! However, if you do need one, then Google will be happy to provide it. To choose one, click on the Runtime option in the menu bar. Then click on Change runtime type, and you can select a GPU or TPU for whatever you are doing.
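After changing the runtime type, it can be worth confirming that you actually got a GPU before starting a long training run. Here is one rough way to check, assuming an NVIDIA GPU and the nvidia-smi command-line tool on the machine. The function name gpu_summary is just something I made up for this sketch, and it does not detect TPUs.

```python
import shutil
import subprocess


def gpu_summary():
    """Return a short description of the attached NVIDIA GPU(s), or None if none is found."""
    if shutil.which("nvidia-smi") is None:
        return None  # tool not installed, so most likely no NVIDIA GPU runtime
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # tool is present but no usable GPU was reported
    return result.stdout.strip() or None


summary = gpu_summary()
print(summary if summary else "No GPU detected; you are on a CPU-only runtime.")
```

If this prints the CPU-only message and you do not need a GPU, you are already being a good citizen.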

Just like that, you can train your models with the power of Google Cloud! If you built a model and would like to save it, you can use pickle to do so. If you are using Keras for deep learning, you will have an easier time, because Keras already has a built-in method that does this. You can reference my previous post on how to do that in Keras. Happy coding!
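For reference, here is a minimal sketch of the pickle approach. The dictionary below is only a stand-in for a trained model so the example runs anywhere; any picklable object, such as a fitted scikit-learn estimator, is saved and loaded the same way. The file name model.pkl and the temporary folder are placeholders. In Colab you would probably point the path at your mounted Drive folder so the file survives after the runtime is terminated.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for a trained model; replace it with your own fitted object.
model = {"weights": [0.25, -1.5, 3.0], "bias": 0.1}

# Placeholder location; a folder on your mounted Drive would persist across sessions.
save_dir = Path(tempfile.mkdtemp())
model_path = save_dir / "model.pkl"

# Save the model to disk.
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Load it back later, for example in a fresh runtime.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

print(restored == model)
```

One word of caution: only unpickle files you created yourself or otherwise trust, because loading a pickle can execute arbitrary code.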