A Quick Introduction to TensorFlow 2.0 for Deep Learning

Source: Deep Learning on Medium


After much community hype and anticipation, TensorFlow 2.0 was finally released by Google on September 30, 2019.

TensorFlow 2.0 represents a major milestone in the library’s development. Over the past few years, one of TensorFlow’s main weaknesses, and a big reason many people switched over to PyTorch, was its very complicated API.

Defining deep neural networks required far more work than was reasonable. This led to the development of several high-level APIs that sat on top of TensorFlow, including TF Slim and Keras.

Now things have come full circle, as Keras is the official high-level API of TensorFlow 2.0. Loading data, defining models, training, and evaluating are all now much easier to do, with cleaner Keras-style code and faster development time.

This article will be a quick introduction to the new TensorFlow 2.0 way of doing Deep Learning using Keras. We’ll go through an end-to-end pipeline of loading our dataset, defining our model, training, and evaluating, all with the new TensorFlow 2.0 API. If you’d like to run the entire code yourself, I’ve set up a Google Colab Notebook with the whole thing!

Import and Setup

We’ll start off by importing TensorFlow, Keras, and Matplotlib. Notice how we import Keras directly from TensorFlow using tensorflow.keras, as it’s now bundled right within it. We also have an if statement to install version 2.0 in case our notebook is running an older version.
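As a rough sketch of what this setup might look like, the snippet below imports everything and checks the installed version. The version check here only prints a message rather than installing anything, and the name tf_major_version is our own choice for illustration.

```python
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

# Parse the major version so we can confirm the 2.x API is available.
tf_major_version = int(tf.__version__.split(".")[0])

if tf_major_version < 2:
    # In a notebook, upgrading TensorFlow with pip and restarting the
    # runtime would be the usual fix.
    print("Found TensorFlow", tf.__version__, "but this tutorial assumes 2.0 or newer.")
else:
    print("Using TensorFlow", tf.__version__)
```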

Next, we’ll load up our dataset. For this tutorial, we’re going to use the MNIST dataset, which contains 60,000 training images and 10,000 test images of digits from 0 to 9, each 28×28 pixels. It’s a pretty basic dataset that is used all the time for quick tests and PoCs. There’s also some visualization code using Matplotlib so we can take a look at the data.
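Here is a minimal sketch of how the loading and visualization could be done. The helper names load_mnist and show_digits are hypothetical, made up for this example; the only library call that touches the dataset is datasets.mnist.load_data(), which downloads MNIST on first use.

```python
import matplotlib.pyplot as plt
from tensorflow.keras import datasets


def load_mnist():
    """Download (on first call) and return the MNIST train and test splits."""
    (train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data()
    return (train_images, train_labels), (test_images, test_labels)


def show_digits(images, labels, count=10):
    """Draw the first `count` images in a row, each titled with its label."""
    count = min(count, len(images))
    # squeeze=False keeps `axes` two-dimensional even when count is 1.
    fig, axes = plt.subplots(1, count, figsize=(1.2 * count, 1.6), squeeze=False)
    for ax, image, label in zip(axes[0], images, labels):
        ax.imshow(image, cmap="gray")
        ax.set_title(str(label))
        ax.axis("off")
    return fig
```

In a notebook, we could call (train_images, train_labels), (test_images, test_labels) = load_mnist() and then show_digits(train_images, train_labels) to get a row of sample digits.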

Visualizing MNIST digits

Creating a Convolutional Neural Network for Image Classification

A natural choice for image classification is a Convolutional Neural Network (CNN). The tensorflow.keras.layers API has everything we need to build such a network. Since MNIST is quite small (28×28 images and only 60,000 training examples), we don’t need a huge network, so we’ll keep it simple.

The formula for building a good CNN has largely remained the same over the past few years: stack convolution layers (typically 3×3 or 1×1) with non-linear activations in between (typically ReLU), then add a couple of fully connected layers and a Softmax function at the very end to get the class probabilities. We’ve done all of that in the network definition below.

Our model has a total of 6 convolutional layers with a ReLU activation after each one. After the convolutional layers, we have a GlobalAveragePooling2D layer that averages each feature map down to a single value, giving us a flat vector. We finish off with our fully connected (Dense) layers, with the last one having a size of 10 for the 10 classes of MNIST.

Again, notice how all of our model layers come right from tensorflow.keras.layers and that we’re using the functional API of Keras. With the functional API, we build our model as a chain of function calls. The first layer takes the input tensor as its argument. Following that, each subsequent layer takes the output of the previous layer as its input. The final Model() call simply connects the “pipeline” from the input to the output tensors.
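The sketch below is one way to write such a network with the functional API. The kernel size (3×3), the "same" padding, and the two stride-2 layers are assumptions inferred from the output shapes and parameter counts in the model summary, so this is a reconstruction rather than necessarily the exact original code. The function name build_cnn is our own.

```python
from tensorflow.keras import layers, models


def build_cnn(input_shape=(28, 28, 1), num_classes=10):
    """Build a small CNN: 6 conv layers, global average pooling, 2 dense layers."""
    inputs = layers.Input(shape=input_shape)
    x = inputs

    # (filters, stride) for each conv layer. The stride-2 layers halve the
    # spatial size: 28x28 -> 14x14 -> 7x7.
    conv_config = [(32, 1), (32, 2), (64, 1), (64, 2), (64, 1), (64, 1)]
    for filters, stride in conv_config:
        x = layers.Conv2D(filters, kernel_size=3, strides=stride, padding="same")(x)
        x = layers.Activation("relu")(x)

    # Average each 7x7 feature map down to one value, giving a 64-d vector.
    x = layers.GlobalAveragePooling2D()(x)

    x = layers.Dense(32)(x)
    x = layers.Activation("relu")(x)
    x = layers.Dense(num_classes)(x)
    outputs = layers.Activation("softmax")(x)

    # Model() ties the input tensor to the output tensor.
    return models.Model(inputs=inputs, outputs=outputs)


model = build_cnn()
model.summary()
```

Each layer object is called like a function on the previous tensor, which is what makes the chain easy to read from top to bottom.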

For a more detailed description of the model, check out the printout of model.summary() down below.

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_3 (InputLayer)         [(None, 28, 28, 1)]       0
_________________________________________________________________
conv2d_12 (Conv2D)           (None, 28, 28, 32)        320
_________________________________________________________________
activation_16 (Activation)   (None, 28, 28, 32)        0
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 14, 14, 32)        9248
_________________________________________________________________
activation_17 (Activation)   (None, 14, 14, 32)        0
_________________________________________________________________
conv2d_14 (Conv2D)           (None, 14, 14, 64)        18496
_________________________________________________________________
activation_18 (Activation)   (None, 14, 14, 64)        0
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 7, 7, 64)          36928
_________________________________________________________________
activation_19 (Activation)   (None, 7, 7, 64)          0
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 7, 7, 64)          36928
_________________________________________________________________
activation_20 (Activation)   (None, 7, 7, 64)          0
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 7, 7, 64)          36928
_________________________________________________________________
activation_21 (Activation)   (None, 7, 7, 64)          0
_________________________________________________________________
global_average_pooling2d_2 ( (None, 64)                0
_________________________________________________________________
dense_4 (Dense)              (None, 32)                2080
_________________________________________________________________
activation_22 (Activation)   (None, 32)                0
_________________________________________________________________
dense_5 (Dense)              (None, 10)                330
_________________________________________________________________
activation_23 (Activation)   (None, 10)                0
=================================================================
Total params: 141,258
Trainable params: 141,258
Non-trainable params: 0
_________________________________________________________________

Training and Testing

Here comes the best part: training and getting our actual results!

First off, we’ll need to do a bit of data preprocessing to have the data properly formatted for training. Our training images need to be in an array of 4 dimensions with the format of:

(batch_size, height, width, channels)

We convert the images to float32, which training requires, and normalize them by dividing by 255 so that each pixel has a value between 0.0 and 1.0.
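A small sketch of this preprocessing step, using only NumPy. The function name preprocess_images is hypothetical, and it assumes the raw images arrive as an integer array of shape (num_images, 28, 28) with values from 0 to 255, which is how MNIST is typically delivered.

```python
import numpy as np


def preprocess_images(images):
    """Scale pixel values to [0.0, 1.0] and add a trailing channel axis."""
    # Cast first, then divide by the maximum pixel value of 255.
    images = images.astype("float32") / 255.0
    # (num_images, 28, 28) -> (num_images, 28, 28, 1)
    return np.expand_dims(images, axis=-1)
```

The same function would be applied to both the training and the test images so that the model sees identically scaled inputs at evaluation time.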

As for the labels, since we are using a Softmax output with the categorical_crossentropy loss, we’ll want our targets to be in the form of one-hot encoded vectors. To do so, we use the tf.keras.utils.to_categorical() function. The second argument of the function is set to 10 since we have 10 classes.
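To make the one-hot encoding concrete, here is a tiny example on three made-up labels (the values 0, 3, and 9 are just for illustration).

```python
import numpy as np
import tensorflow as tf

# Three example integer labels standing in for real MNIST labels.
labels = np.array([0, 3, 9])

# Each label becomes a length-10 vector with a single 1 at the label's index.
one_hot = tf.keras.utils.to_categorical(labels, 10)

print(one_hot.shape)
print(one_hot[1])
```

The second row has its 1 at index 3, matching the label 3, and zeros everywhere else.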

We select Adam as our optimizer of choice — it’s super easy to use and works well out of the box. We set the loss function to be categorical_crossentropy which is compatible with our Softmax. Training the CNN is then as easy as calling the Keras .fit() function with our data as input!
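The compile and fit steps might look roughly like the sketch below. The helper name compile_and_train and the batch size of 64 are assumptions for illustration; only Adam, categorical_crossentropy, and the 5 epochs come from the text.

```python
import tensorflow as tf


def compile_and_train(model, train_images, train_labels,
                      epochs=5, batch_size=64, validation_data=None):
    """Compile a Keras model with Adam and cross-entropy, then fit it."""
    model.compile(
        optimizer=tf.keras.optimizers.Adam(),
        loss="categorical_crossentropy",  # expects one-hot encoded labels
        metrics=["accuracy"],
    )
    # fit() returns a History object whose .history dict holds per-epoch metrics.
    history = model.fit(
        train_images,
        train_labels,
        epochs=epochs,
        batch_size=batch_size,
        validation_data=validation_data,
        verbose=2,
    )
    return history
```

Passing the preprocessed test set as validation_data=(test_images, test_labels) would also record val_loss and val_accuracy after every epoch.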

Notice how all of this is almost purely Keras. Really the only difference is that we are using the Keras library from TensorFlow, i.e., tensorflow.keras. It’s incredibly convenient as it comes in one nice package: the power of TensorFlow with the ease of Keras. Brilliant!

MNIST is an easy dataset so our CNN should reach high accuracy quite quickly. In my own experiments, it got to about 97% within 5 epochs.

Once training is complete, we can plot the history of the loss and accuracy. Once again, we use pure Keras code to pull the loss and accuracy information from the history. Matplotlib is used for easy plotting.
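A plotting helper along these lines would do the job. The name plot_history is hypothetical, and it assumes we pass in the history.history dictionary returned by .fit(), with the "loss" and "accuracy" keys that TensorFlow 2.x records when metrics=["accuracy"] is used.

```python
import matplotlib.pyplot as plt


def plot_history(history_dict):
    """Plot per-epoch loss and accuracy curves side by side."""
    fig, (loss_ax, acc_ax) = plt.subplots(1, 2, figsize=(10, 4))

    loss_ax.plot(history_dict["loss"], label="train")
    acc_ax.plot(history_dict["accuracy"], label="train")

    # Validation curves are only present if validation data was given to fit().
    if "val_loss" in history_dict:
        loss_ax.plot(history_dict["val_loss"], label="validation")
    if "val_accuracy" in history_dict:
        acc_ax.plot(history_dict["val_accuracy"], label="validation")

    loss_ax.set_title("Loss")
    acc_ax.set_title("Accuracy")
    for ax in (loss_ax, acc_ax):
        ax.set_xlabel("Epoch")
        ax.legend()
    return fig
```

After training, calling plot_history(history.history) would draw both curves in a notebook cell.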