Source: Deep Learning on Medium

# TensorFlow, Sequential and Functional Models

In this blog, we will compare the performance of TensorFlow Keras Sequential and Functional models, as well as the performance of different versions of TensorFlow.

First, we will use the MNIST dataset to train our model. We will work with the Sequential API and compare TensorFlow versions 1.15.0 and 2.0.0. In both cases we train for 50 epochs. The setup code is as follows:

```python
try:
    %tensorflow_version 1.x
except Exception:
    pass

import tensorflow as tf
print(tf.__version__)
assert tf.__version__.startswith('1')
```

(For TensorFlow version 2.0.0, change `%tensorflow_version 1.x` to `%tensorflow_version 2.x` and `assert tf.__version__.startswith('1')` to `assert tf.__version__.startswith('2')`.)

```python
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras import Model
import matplotlib.pyplot as plt

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=50)
model.evaluate(x_test, y_test)
```

For **TensorFlow version 1.15.0**, we get an average throughput of **5s 80us/sample** and a best accuracy of **93.27%**. For **TensorFlow version 2.0.0**, we get an average throughput of **4s 65us/sample** and a best accuracy of **93.40%**. For the Sequential Model, the results indicate that the more recent version of TensorFlow improves training speed but still gives roughly the same accuracy of approximately 93% on the MNIST dataset. We will use this result as the baseline for the Functional Model that we will define.
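As a quick sanity check on these throughput numbers, per-epoch time should be roughly per-sample time multiplied by the 60,000 samples in the MNIST training set. A back-of-envelope sketch:

```python
# Back-of-envelope check: epoch time ≈ per-sample time × number of samples.
# MNIST's training set has 60,000 samples.
samples = 60_000
tf1_epoch_s = 80 * samples / 1e6  # 80 us/sample under TF 1.15.0
tf2_epoch_s = 65 * samples / 1e6  # 65 us/sample under TF 2.0.0
print(tf1_epoch_s, tf2_epoch_s)  # 4.8 3.9
```

These estimates line up with the reported ~5s and ~4s per epoch.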

Now, we will train the Functional Model that we define ourselves, including our own softmax layer. We will use the same MNIST dataset and measure training time and accuracy. Note that this model uses a custom training loop with `tf.GradientTape` and `@tf.function`, so it cannot be run with TensorFlow version 1.15.0. The setup code is as follows:

```python
# Download the dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Batch and shuffle the data
train_ds = tf.data.Dataset.from_tensor_slices(
    (x_train.astype('float32') / 255, y_train)).shuffle(1024).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices(
    (x_test.astype('float32') / 255, y_test)).batch(32)

loss_object = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam()

train_loss = tf.keras.metrics.Mean(name='train_loss')
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')
test_loss = tf.keras.metrics.Mean(name='test_loss')
test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='test_accuracy')
```

```python
def my_softmax(x):
    # x is of size [batch_size, class]
    batch_size = len(x)          # 32
    number_of_class = len(x[0])  # 10

    # Get the max logit for each row
    max_tensors = tf.reduce_max(x, axis=1)  # shape (32,)
    # Stack the max logit value 10 times along the column
    max_tensors = tf.stack([max_tensors] * number_of_class, axis=1)  # shape (32, 10)
    # Subtract the row max from each logit
    subtracted_tensors = x - max_tensors  # shape (32, 10)
    # For each subtracted logit, apply logit = e^logit
    exp_of_subtracted_tensors = tf.exp(subtracted_tensors)  # shape (32, 10)
    # For each row, sum e^logit
    sum_of_exp_of_subtracted_tensors = tf.reduce_sum(exp_of_subtracted_tensors, axis=1)  # shape (32,)
    # Stack the sum 10 times along the column
    sum_of_exp_of_subtracted_tensors = tf.stack([sum_of_exp_of_subtracted_tensors] * number_of_class, axis=1)  # shape (32, 10)
    # Divide each e^logit by the sum over its corresponding row
    return tf.divide(exp_of_subtracted_tensors, sum_of_exp_of_subtracted_tensors)  # shape (32, 10)
```
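The same max-subtract, exponentiate, normalize steps can be verified against a reference. Here is a minimal NumPy sketch of the logic (not the TensorFlow code above), which also shows why subtracting the row max matters for large logits:

```python
import numpy as np

def stable_softmax(x):
    # Subtract the per-row max (numerical stability), exponentiate, normalize.
    shifted = x - x.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

logits = np.array([[1.0, 2.0, 3.0], [1000.0, 1000.0, 1000.0]])
probs = stable_softmax(logits)
print(probs.sum(axis=1))  # each row sums to 1
print(probs[1])           # uniform 1/3 each; a naive exp(1000) would overflow
```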

```python
class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.flatten = Flatten()
        self.d1 = Dense(10)

    def call(self, x):
        x = self.flatten(x)
        x = self.d1(x)
        return my_softmax(x)

subclass_linear_model_softmax_from_scratch = MyModel()
```

```python
@tf.function
def train_step_2(images, labels):
    with tf.GradientTape() as tape:
        predictions = subclass_linear_model_softmax_from_scratch(images)
        loss = loss_object(labels, predictions)
    gradients = tape.gradient(loss, subclass_linear_model_softmax_from_scratch.trainable_variables)
    optimizer.apply_gradients(zip(gradients, subclass_linear_model_softmax_from_scratch.trainable_variables))

    train_loss(loss)
    train_accuracy(labels, predictions)
```

```python
import time

EPOCHS = 50
for epoch in range(EPOCHS):
    t0 = time.time()
    for images, labels in train_ds:
        train_step_2(images, labels)

    template = 'Epoch {}, Training Time: {}, Loss: {}, Accuracy: {}'
    print(template.format(epoch + 1,
                          time.time() - t0,
                          train_loss.result(),
                          train_accuracy.result() * 100))

    # Reset the metrics for the next epoch
    train_loss.reset_states()
    train_accuracy.reset_states()
```

For **TensorFlow version 2.0.0**, we get an average throughput of **3s 87us/sample** and a best accuracy of **93.92%**. Compared to the Sequential Model's throughput of **4s 65us/sample**, each epoch is approximately 1s faster. There is no large difference in accuracy, but the Functional Model shows a slight improvement.

From this experiment, we can conclude the following:

- With the more recent TensorFlow version, training becomes faster, but accuracy does not meaningfully improve.
- Compared to the Sequential Model, the Functional Model shows a significant improvement in training time and a slight improvement in accuracy.