Two Ways of Setting up GPU Computing on Google Cloud Platform and Keras Transfer Learning

Source: Deep Learning on Medium


By Ru Chen

In this guide, I will share a step-by-step walkthrough of how to run Keras transfer learning — a dog breed classifier — with a GPU on Google Cloud Platform (GCP). One nice thing about GCP is that they give you $300 of free credit for one year, which is great for beginners exploring cloud computing. However, setting up a GPU on a GCP compute engine can be a bit tricky because of compatibility issues between the Linux version, CUDA, and TensorFlow. You could use Docker containers to avoid these library conflicts, but if you want to set up GPU support yourself, I will show two simple ways of installing everything and run a Keras transfer learning test case on the instance — hopefully it will save you some time. The guide is divided into five parts:

a. Initialize GPU Compute Engine
b. Install Anaconda and set up the environment
c. First way of installing CUDA, cuDNN, TensorFlow-GPU and Keras by Anaconda
d. Second way of installing CUDA, cuDNN & TensorFlow-GPU and Keras by pip
e. Running the Keras transfer learning model with GPU and benchmarks

STEP A: Initialize GPU Compute Engine

The first step is to sign up for a GCP account and claim the $300 credit. I chose a VM instance with Ubuntu 18.04 LTS, a 40 GB boot disk, and a static IP address. For this test case, I chose just two vCPUs and one GPU. Also, don't forget to request a GPU quota increase manually and wait for approval; otherwise you will run into an error when creating the VM instance. You can find many more detailed tutorials on configuring instances online.
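As a reference, an instance like the one described above can be created from the Cloud SDK roughly as follows. This is only a sketch: the instance name, zone, and GPU type here are my own placeholders — adjust them to your project, region, and approved quota.

```shell
# Sketch: a 2-vCPU instance with one GPU and a 40 GB Ubuntu 18.04 boot disk.
# Instance name, zone, and GPU type are placeholders -- adjust to your project.
gcloud compute instances create gpu-instance-1 \
  --zone=us-west1-b \
  --machine-type=n1-standard-2 \
  --accelerator=type=nvidia-tesla-k80,count=1 \
  --image-family=ubuntu-1804-lts \
  --image-project=ubuntu-os-cloud \
  --boot-disk-size=40GB \
  --maintenance-policy=TERMINATE \
  --restart-on-failure
```

Note that GPU instances require `--maintenance-policy=TERMINATE`, since GPU VMs cannot be live-migrated during host maintenance.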

STEP B: Install Anaconda and set up the environment

The second step is to install Anaconda, which makes it easier to set up virtual environments and manage Python packages. It is best to follow the newest installation guide on the official Anaconda website. To install it, I just typed:

wget https://repo.anaconda.com/archive/Anaconda3-2018.12-Linux-x86_64.sh
bash ./Anaconda3-2018.12-Linux-x86_64.sh

to install Anaconda. When you're done, type conda info in the terminal to see basic information about your Anaconda installation. The next step is to create a virtual environment:

conda create -n virenv1 python=3.6
conda activate virenv1

It is important to install packages in a virtual environment so that, even if the installation fails, we won't mess up the base environment and have to start over. Here I named the virtual environment virenv1 and chose Python 3.6, since it is compatible with TensorFlow, CUDA, and cuDNN. I have also found this Anaconda cheat sheet extremely useful:

https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf

STEP C: First way of installing CUDA, cuDNN, TensorFlow-GPU and Keras by Anaconda

Steps C and D are the crucial ones: we need to install CUDA, cuDNN, TensorFlow-GPU, and Keras. Remember to install them in the virtual environment you just created. I have found two ways to install them, and hopefully one of them will make your life a bit easier. First, the super easy way:

conda install tensorflow-gpu
conda install keras

Done! Isn't it wonderful? Conda automatically picks compatible versions of CUDA and cuDNN to install, and Keras will use the GPU since it runs on the TensorFlow backend. The only downside is that the installed versions are somewhat outdated: TensorFlow 1.12.0, CUDA 9.2, and cuDNN 7.3.1 are the newest versions you can install through conda at the time of writing. However, as I will show in the last part, the performance of Keras installed this way is actually comparable to the pip-installed version. So if you don't mind an older version, this is really the simplest and quickest way. Don't forget to deactivate the virtual environment when you're done:

conda deactivate 

STEP D: Second way of installing CUDA, cuDNN, TensorFlow-GPU and Keras by pip

Alternatively, if you want to install the latest versions of TensorFlow, CUDA, and cuDNN, consider pip instead. Check the TensorFlow website for the latest GPU support instructions.

First, let's create another virtual environment and activate it:

conda create -n virenv2 python=3.6
conda activate virenv2

I have chosen Ubuntu 18.04 LTS and CUDA 10.0. TensorFlow 1.13, CUDA 10.0, and cuDNN 7.4.1.5 are the most up-to-date versions you can install by pip at the time of writing. Here I just follow the instructions from the TensorFlow website:

# Add NVIDIA package repositories
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update

# Install NVIDIA driver
sudo apt-get install --no-install-recommends nvidia-driver-410
# Reboot. Check that GPUs are visible using the command: nvidia-smi

# Install development and runtime libraries (~4GB)
sudo apt-get install --no-install-recommends \
cuda-10-0 \
libcudnn7=7.4.1.5-1+cuda10.0 \
libcudnn7-dev=7.4.1.5-1+cuda10.0


# Install TensorRT. Requires that libcudnn7 is installed above.
sudo apt-get update && \
sudo apt-get install nvinfer-runtime-trt-repo-ubuntu1804-5.0.2-ga-cuda10.0 \
&& sudo apt-get update \
&& sudo apt-get install -y --no-install-recommends libnvinfer-dev=5.0.2-1+cuda10.0
# Append the CUPTI directory to the LD_LIBRARY_PATH environment variable:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64
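The export above only lasts for the current shell session; to make it persist across logins, you can append it to your shell profile (a sketch assuming a bash login shell):

```shell
# Persist the CUPTI library path across sessions (assumes bash; the path
# matches the CUDA 10.0 install above)
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64' >> ~/.bashrc
```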

It takes some time, and finally, after all the drivers and libraries have been installed, we can install TensorFlow and Keras:

pip install tensorflow-gpu
pip install keras
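With either installation method, it is worth verifying that TensorFlow can actually see the GPU before training anything. Run this inside the activated environment; on a CPU-only machine it will simply report no GPU:

```python
# List the devices TensorFlow can see; a working GPU setup shows a GPU entry
# alongside the CPU.
from tensorflow.python.client import device_lib
import tensorflow as tf

devices = device_lib.list_local_devices()
for d in devices:
    print(d.name, d.device_type)

# True only if a CUDA-enabled GPU is visible to TensorFlow
print(tf.test.is_gpu_available())
```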

STEP E: Running the Keras Transfer Learning model and benchmarks

Hooray! We have installed TensorFlow with GPU support (almost manually) in two ways! Let's move on to the last step and get the test case running. First, install some essential packages with conda in the virtual environment of your choice (plus whatever other Python packages you prefer):

conda install matplotlib ipython jupyter scikit-learn pillow

Our environment and libraries are finally good to go, so let's run a test case with minimal coding! We will perform image preprocessing and transfer learning in Keras on a super cute classification problem: dog breed identification. The dataset comes from a former Kaggle competition; please download it to your instance and unzip it. There are 10,222 images belonging to 120 classes. Keras has a really nice data generator that can automatically process images from folders or files with real-time augmentation. In this case, the training images are in the folder 'train/' and the label file 'labels.csv' contains each dog's id and breed. The code below is a minimal test case, without any optimization and with little preprocessing:

import pandas as pd
from keras.layers import Dense
from keras.applications.resnet50 import ResNet50, preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from sklearn.preprocessing import LabelBinarizer
from keras.models import Sequential
from keras.optimizers import Adam

# Data directory setup
train_dir = './train/'
test_dir = './test/'
df = pd.read_csv('./labels.csv')
Batch_size = 32
Target_size = 224

# LabelBinarizer one-hot encodes the breed labels directly; we only need it
# here to derive the number of classes (the generator does its own encoding)
breed_labels = df['breed'].values
encoder = LabelBinarizer()
y = encoder.fit_transform(breed_labels)
num_classes = y.shape[1]

# The generator reads images by filename, so append the .jpg suffix to the ids
df['id_with_suffix'] = df['id'].astype(str) + '.jpg'
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_generator = datagen.flow_from_dataframe(
    dataframe=df, directory=train_dir,
    x_col='id_with_suffix', y_col='breed', class_mode='categorical',
    target_size=(Target_size, Target_size), batch_size=Batch_size)

# Frozen pre-trained ResNet50 base plus a softmax classification head
model = Sequential()
model.add(ResNet50(include_top=False, weights='imagenet', pooling='avg'))
model.add(Dense(num_classes, activation='softmax'))
model.layers[0].trainable = False
model.summary()

adam = Adam(lr=0.0005)
model.compile(optimizer=adam, loss='categorical_crossentropy',
              metrics=['accuracy'])
STEP_SIZE_TRAIN = train_generator.n // train_generator.batch_size
fit_history = model.fit_generator(generator=train_generator,
                                  steps_per_epoch=STEP_SIZE_TRAIN,
                                  epochs=2,
                                  shuffle=True)

We have used a pre-trained ResNet50 with weights trained on ImageNet.
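As a quick sanity check on the generator arithmetic above (no Keras required): with 10,222 training images and a batch size of 32, one epoch covers 319 full batches, and the leftover images are simply not seen in that epoch.

```python
# Sanity-check the steps-per-epoch arithmetic used in fit_generator above
num_images = 10222   # training images in the Kaggle dataset
batch_size = 32

steps_per_epoch = num_images // batch_size
print(steps_per_epoch)           # 319 full batches per epoch
print(num_images % batch_size)   # 14 leftover images per epoch
```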

Here is a benchmark on my virtual machine (two vCPUs and one GPU) comparing the two ways of installing TensorFlow with GPU support:

Installation method   Average time per epoch   Training accuracy after 2 epochs
-------------------   ----------------------   --------------------------------
Conda                 80 seconds               0.7346
pip                   82 seconds               0.7355
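If you want to reproduce such timings yourself, Keras prints the per-epoch time in its progress bar; alternatively, a minimal timing helper (pure Python, my own sketch) lets you time the whole training call:

```python
import time

def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Example usage with the model from Step E:
# history, seconds = time_call(model.fit_generator, generator=train_generator,
#                              steps_per_epoch=STEP_SIZE_TRAIN, epochs=2)
# print(seconds / 2)  # average seconds per epoch
```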

We can see their performance is really close, so choosing conda or pip to install TensorFlow and Keras with GPU support is mostly a matter of preference. In fact, the conda-installed TensorFlow and Keras, although older, were slightly faster here and much easier to install. Therefore, I would recommend installing with conda if you don't need the most up-to-date versions of TensorFlow, CUDA, and cuDNN.