The TensorFlow Way


Key components of how TensorFlow operates

We will introduce the key components of how TensorFlow operates. Then we will tie them together to create a simple classifier and evaluate the outcome. We will be learning about the following:

  1. Operations in a Computational Graph
  2. Layering Nested Operations
  3. Working with Multiple Layers
  4. Implementing Loss Functions
  5. Implementing Back Propagation
  6. Working with Batch and Stochastic Training
  7. Combining Everything Together
  8. Evaluating Models

In this article, we will cover the first 3 points.

In the previous article, we learned how TensorFlow creates tensors and uses variables and placeholders. Here we will introduce how to act on these objects in a computational graph. From this, we can set up a simple classifier and see how well it performs.

Operations in a Computational Graph

Now that we can put objects into our computational graph, we will introduce the operations that act on such objects.

Getting ready…

To start a graph, we load TensorFlow and create a session, as follows:

import tensorflow as tf
sess = tf.Session()

How to do it…

We will combine what we have learned so far, feed each number in a list into an operation in the graph, and print the output:

  1. First, we declare our tensors and placeholders. Here we will create a NumPy array to feed into our operation:
    import numpy as np
    x_vals = np.array([1., 3., 5., 7., 9.])
    x_data = tf.placeholder(tf.float32)
    m_const = tf.constant(3.)
  2. Next, we declare the multiplication operation that ties the placeholder and the constant together:
    my_product = tf.multiply(x_data, m_const)
  3. Finally, we loop through the input values, feed each one into the graph, and print the output:
    for x_val in x_vals:
        print(sess.run(my_product, feed_dict={x_data: x_val}))
    3.0
    9.0
    15.0
    21.0
    27.0

How it works…

Steps 1 and 2 create the data and operations on the computational graph. Then, in step 3, we feed the data through the graph and print the output. Here is what the computational graph looks like:

Here we can see in the graph that the placeholder, x_data, along with our multiplicative constant, feeds into the multiplication operation.

Layering Nested Operations

In this section, we will learn how to put multiple operations on the same computational graph.

Getting ready…

It’s important to know how to chain operations together. This will set up layered operations in the computational graph. For an illustration, we will multiply a placeholder by two matrices and then perform addition. We will feed in two matrices in the form of a three-dimensional NumPy array:

import tensorflow as tf
import numpy as np
sess = tf.Session()

How to do it…

It is also important to note how the data changes shape as it passes through the graph. We will feed in two NumPy arrays of size 3×5. We will multiply each matrix by a constant of size 5×1, which will result in a matrix of size 3×1. We will then multiply this by a 1×1 matrix, resulting in a 3×1 matrix again. Finally, we add a 1×1 constant, which is broadcast across the 3×1 matrix, as follows:

  1. First we create the data to feed in and the corresponding placeholder:
    my_array = np.array([[1., 3., 5., 7., 9.],
                         [-2., 0., 2., 4., 6.],
                         [-6., -3., 0., 3., 6.]])
    x_vals = np.array([my_array, my_array + 1])
    x_data = tf.placeholder(tf.float32, shape=(3, 5))
  2. Next we create the constants that we will use for matrix multiplication and addition:
    m1 = tf.constant([[1.], [0.], [-1.], [2.], [4.]])
    m2 = tf.constant([[2.]])
    a1 = tf.constant([[10.]])
  3. Now we declare the operations and add them to the graph:
    prod1 = tf.matmul(x_data, m1)
    prod2 = tf.matmul(prod1, m2)
    add1 = tf.add(prod2, a1)
  4. Finally, we feed the data through our graph:
    for x_val in x_vals:
        print(sess.run(add1, feed_dict={x_data: x_val}))
    [[ 102.]
     [ 66.]
     [ 58.]]
    [[ 114.]
     [ 78.]
     [ 70.]]
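As a quick sanity check (a small sketch added here, not part of the original recipe), the same chain of operations can be reproduced with plain NumPy to confirm the outputs above:

import numpy as np

my_array = np.array([[1., 3., 5., 7., 9.],
                     [-2., 0., 2., 4., 6.],
                     [-6., -3., 0., 3., 6.]])
m1 = np.array([[1.], [0.], [-1.], [2.], [4.]])

for x_val in (my_array, my_array + 1):
    # (3x5) times (5x1) gives (3x1); then multiply by 2 and add 10
    print(x_val.dot(m1) * 2. + 10.)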

How it works…

The computational graph we just created can be visualized with TensorBoard, a feature of TensorFlow that allows us to visualize the computational graph and the values flowing through it. These visualization features are provided natively, unlike in many other machine learning frameworks. Here is what our layered graph looks like:

In this computational graph, you can see the data size as it propagates upward through the graph.
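If you want to write the graph out so that TensorBoard can display it, a minimal sketch (reusing the session created above; the log directory name tensorboard_logs is an arbitrary choice) looks like this:

# Write the current graph definition to disk for TensorBoard,
# then run: tensorboard --logdir=tensorboard_logs
writer = tf.summary.FileWriter('tensorboard_logs', sess.graph)
writer.close()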

NOTE: We generally have to declare the data shape and know the outcome shape of the operations before we run data through the graph (though not always). There may be a dimension or two that we do not know beforehand or that can vary. To handle this, we set the dimension that can vary or is unknown to the value None. For example, to have the prior data placeholder accept an unknown number of columns, we would write the following line:
x_data = tf.placeholder(tf.float32, shape=(3, None))
This relaxes the shape checking when the graph is built, but we must still obey the matrix multiplication rules: the multiplying constant must have a matching number of rows. We can either generate this constant dynamically or reshape x_data as we feed data into our graph.
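To illustrate (a small sketch added here, not part of the original recipe, reusing the session and imports from above; x_flex is an arbitrary name), a placeholder declared with shape (3, None) accepts inputs with any number of columns:

# Only the row count is fixed; the column count may vary per feed
x_flex = tf.placeholder(tf.float32, shape=(3, None))
col_sum = tf.reduce_sum(x_flex, axis=1)
print(sess.run(col_sum, feed_dict={x_flex: np.ones((3, 5))}))  # sums over 5 columns
print(sess.run(col_sum, feed_dict={x_flex: np.ones((3, 7))}))  # sums over 7 columns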

Working with Multiple Layers

In this section, we will learn how to connect various layers that have data propagating through them.

Getting ready…

We will learn how best to connect various layers, including custom layers. The data we will generate and use will be representative of small random images. It is best to understand this type of operation with a simple example, and to see how we can use some built-in layers to perform calculations. We will perform a small moving window average across a 2D image and then flow the resulting output through a custom operation layer.

Along the way, we will also introduce ways to name operations and create scopes for layers. To start, load numpy and tensorflow and create a session, using the following:
import tensorflow as tf
import numpy as np
sess = tf.Session()

How to do it…

  1. First we create our sample 2D image with numpy . This image will be a 4×4 pixel image. We will create it in four dimensions; the first and last dimension will have a size of one. Note that some TensorFlow image functions will operate on four-dimensional images. Those four dimensions are image number, height, width, and channel, and to make it one image with one channel, we set two of the dimensions to 1, as follows:
    x_shape = [1, 4, 4, 1]
    x_val = np.random.uniform(size=x_shape)
  2. Now we have to create the placeholder in our graph where we can feed in the sample image, as follows:
    x_data = tf.placeholder(tf.float32, shape=x_shape)
  3. To create a moving window average across our 4×4 image, we will use a built-in function that will convolute a constant across a window of the shape 2×2. This function is quite commonly used in image processing, and in TensorFlow the function we will use is conv2d(). This function takes a piecewise product of the window and a filter we specify. We must also specify a stride for the moving window in both directions. Here we will compute four moving window averages: the top-left, top-right, bottom-left, and bottom-right four pixels. We do this by creating a 2×2 window and having strides of length 2 in each direction. To take the average, we will convolute the 2×2 window with a constant of 0.25, as follows:
    my_filter = tf.constant(0.25, shape=[2, 2, 1, 1])
    my_strides = [1, 2, 2, 1]
    mov_avg_layer = tf.nn.conv2d(x_data, my_filter, my_strides,
                                 padding='SAME', name='Moving_Avg_Window')

Note: To figure out the output size of a convolutional layer, we can use the
following formula: Output = (W-F+2P)/S+1, where W is the input size,
F is the filter size, P is the padding of zeros, and S is the stride.
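As a quick check of this formula (a small illustration added here, not part of the original recipe), our example has W=4, F=2, P=0, and S=2, which gives an output size of 2, i.e. a 2×2 result:

def conv_output_size(W, F, P, S):
    # Output = (W - F + 2P) / S + 1
    return (W - F + 2 * P) // S + 1

print(conv_output_size(W=4, F=2, P=0, S=2))  # prints 2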

4. Note that we are also naming this layer Moving_Avg_Window by using the name argument of the function.

5. Now we define a custom layer that will operate on the 2×2 output of the moving window average. The custom function will first multiply the input by another 2×2 matrix tensor, and then add one to each entry. After this we take the sigmoid of each element and return the 2×2 matrix. Since matrix multiplication only operates on two-dimensional matrices, we need to drop the extra dimensions of our image that are of size 1. TensorFlow can do this with the built-in function squeeze() . Here we define the new layer:

def custom_layer(input_matrix):
    # Drop the size-1 dimensions so we are left with a 2x2 matrix
    input_matrix_squeezed = tf.squeeze(input_matrix)
    A = tf.constant([[1., 2.], [-1., 3.]])
    b = tf.constant(1., shape=[2, 2])
    temp1 = tf.matmul(A, input_matrix_squeezed)
    temp = tf.add(temp1, b)  # Ax + b
    return tf.sigmoid(temp)

6. Now we have to place the new layer on the graph. We will do this with a named scope so that it is identifiable and collapsible/expandable on the computational graph, as follows:

with tf.name_scope('Custom_Layer') as scope:
    custom_layer1 = custom_layer(mov_avg_layer)

7. Now we just feed in the 4×4 image in the placeholder and tell TensorFlow to run the graph, as follows:

print(sess.run(custom_layer1, feed_dict={x_data: x_val}))
[[ 0.91914582 0.96025133]
[ 0.87262219 0.9469803 ]]

How it works…

The visualized graph looks better with the naming of operations and scoping of layers. We can collapse and expand the custom layer because we created it in a named scope. In the following figure, see the collapsed version on the left and the expanded version on the right:

Computational graph with two layers. The first layer is named Moving_Avg_Window, and the second is a collection of operations called Custom_Layer. It is collapsed on the left and expanded on the right.

To Be Continued…

Next up: Implementing Loss Functions, and more.