4 Keras Lessons from MachineLearningMastery.com


I decided to take a break from reading research papers to level up my coding skills with the Keras framework. MachineLearningMastery is a website full of great tutorials from Jason Brownlee. Please check it out after reading these 4 lessons I picked up from a couple of hours of reading through posts on the site:

1. Greedy Layer-Wise Pre-training

The idea is to train a deep network's intermediate layers one at a time. You first train a smaller network with, say, only 3 hidden layers, then save the output layer, freeze the weights of those 3 hidden layers, and add another hidden layer to train. Repeating this process gradually builds a deeper network. The following is the Keras code provided for implementing this idea.

from keras.layers import Dense

def add_layer(model, trainX, trainy):
    # Remember the current output layer
    output_layer = model.layers[-1]
    # Remove the output layer
    model.pop()
    # Mark the already-trained layers as non-trainable / 'frozen'
    # (in practice the model may need re-compiling for the flags to take effect)
    for layer in model.layers:
        layer.trainable = False
    # Add a new hidden layer
    model.add(Dense(10, activation='relu'))
    # Put the output layer back on (note: it remains trainable)
    model.add(output_layer)
    # Train only the new hidden layer and the output layer
    model.fit(trainX, trainy, epochs=100, verbose=0)
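
The add_layer function assumes a small base model has already been defined, compiled, and trained. Here is a minimal sketch of such a base model; the input dimension, layer sizes, loss, optimizer settings, and the n_layers value are illustrative assumptions, not taken from the original post.

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

# Hypothetical base model: input dimension, layer sizes, loss, and optimizer
# are illustrative assumptions.
model = Sequential()
model.add(Dense(10, input_dim=2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer=SGD(lr=0.01), metrics=['accuracy'])
model.fit(trainX, trainy, epochs=100, verbose=0)

# Number of additional hidden layers to grow the network by (assumed value)
n_layers = 5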

During training this is used as follows:

for i in range(n_layers):
    add_layer(model, trainX, trainy)

It is an interesting way to grow a network by adding intermediate layers, and thanks to MachineLearningMastery we have foundational code to implement the strategy. Check out the full post here.

2. Custom Loss Functions

Here is how to add a custom loss function in Keras:

import keras.backend as K
from keras.optimizers import SGD

def mean_squared_error(actual, predicted):
    return K.mean(K.square(actual - predicted), axis=1)

model.compile(loss=mean_squared_error, optimizer=SGD(lr=0.01))  # illustrative optimizer settings

Note how the backend (TensorFlow) operations are imported with "import keras.backend as K". Check out this repository for more information on how to do this.
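
As a further illustration of composing backend operations, here is a sketch of another custom loss. This example is not from the original post, and the 0.1 penalty weight is an arbitrary assumption.

import keras.backend as K

# Sketch of a custom loss built from backend ops: squared error plus a small
# absolute-error penalty. The 0.1 weight is an arbitrary illustrative value.
def mse_plus_l1(actual, predicted):
    return K.mean(K.square(actual - predicted) + 0.1 * K.abs(actual - predicted), axis=-1)

# Used the same way as any built-in loss:
# model.compile(loss=mse_plus_l1, optimizer='sgd')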

3. Testing Learning Rates and Batch Size

One of the most important parts of building great Deep Learning models is iterating through different hyper-parameter values. MachineLearningMastery provides a great example of how to do this.

from keras.optimizers import SGD

def fit_model(trainX, trainy, lrate, n_batch):
    # DEFINE AND COMPILE YOUR MODEL HERE, using opt as the optimizer
    opt = SGD(lr=lrate)
    history = model.fit(trainX, trainy, epochs=25, batch_size=n_batch)
    return history

learning_rates = [1E-0, 1E-1, 1E-2, 1E-3, 1E-4, 1E-5]
batch_sizes = [4, 8, 16, 32, 64, 128, 256]
histories = []
for lrate in learning_rates:
    for n_batch in batch_sizes:
        histories.append(fit_model(trainX, trainy, lrate, n_batch))

The code above wraps the logic that defines, compiles, and fits your model into a single function so you can iterate over hyper-parameter values. On each call, fit_model takes a learning rate from the learning_rates list and a batch size from the batch_sizes list.
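
Once the grid of runs has finished, the collected History objects can be compared. Here is a minimal sketch, assuming the model was compiled with metrics=['accuracy'] (older Keras versions record this metric under the key 'acc' instead of 'accuracy').

# Summarize the final training accuracy for every (learning rate, batch size) pair.
# Assumes the model was compiled with metrics=['accuracy']; in older Keras the
# history key is 'acc' rather than 'accuracy'.
settings = [(lrate, n_batch) for lrate in learning_rates for n_batch in batch_sizes]
for (lrate, n_batch), history in zip(settings, histories):
    final_acc = history.history['accuracy'][-1]
    print('lr=%.0e batch=%d -> accuracy=%.3f' % (lrate, n_batch, final_acc))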

4. Convergence Improvement with Batch Normalization

The snippet below shows where BatchNormalization layers can be inserted between Dense layers; toggling the commented-out lines lets you compare training with zero, one, or two Batch Normalization layers.

from keras.models import Sequential
from keras.layers import Dense, BatchNormalization

model = Sequential()
model.add(Dense(50, input_dim=2, activation='relu'))
# model.add(BatchNormalization())
model.add(Dense(50, activation='relu'))
# model.add(BatchNormalization())
model.add(Dense(1, activation='sigmoid'))
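
The accuracy traces reported below come from recording the Keras training history. Here is a minimal sketch of how such traces can be captured; the compile settings and epoch count are assumptions.

# Compile settings and epoch count are illustrative assumptions.
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
history = model.fit(trainX, trainy, epochs=100, verbose=0)
print(history.history['accuracy'][:5])  # first 5 epochs ('acc' in older Keras)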

Training accuracy with both BatchNormalization layers commented out (first 5 epochs):

[0.4820000002384186,
0.524,
0.5459999990463257,
0.5999999995231629,
0.5799999990463257]

Training accuracy with one BatchNormalization layer used (first 5 epochs):

[0.5060000009536744,
0.6199999995231629,
0.7440000004768371,
0.7380000009536744,
0.7680000004768371]

Training accuracy with both BatchNormalization layers used (first 5 epochs):

[0.5200000004768371,
0.69,
0.8120000009536743,
0.7819999995231628,
0.798]

The numbers above show how Batch Normalization can speed up convergence during the early epochs of training Deep Learning models.

Thank you for reading! I hope these quick tips inspire you to check out Machine Learning Mastery and level up your Keras skills.