Keras implementation of an MLP neural network model


Notes:

# This is a Keras implementation of a multilayer perceptron (MLP) neural network model.

# Keras is a deep learning library for Theano and TensorFlow.

# The MLP code shown below solves a binary classification problem.

# This script also prints Area Under Curve (AUC) and plots a Receiver Operating Characteristic (ROC) curve at the end.

# I have tested the code in Python 2.7+

# Required Python modules: Keras, sklearn, pandas, matplotlib


Description + code:

  • First, import all the Python modules.
  • From Keras, import the Sequential model as well as the Dense, Dropout and Activation layers. The Sequential model is a linear stack of layers.
  • Import train_test_split, roc_curve and auc from sklearn.
  • Import the MATLAB-like plotting framework pyplot from matplotlib.
#!/usr/bin/env python
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, auc
import pandas as pd
import matplotlib.pyplot as plt
  • Below is the Python function that initializes the neural network.
  • It comprises a Sequential model that has 3 Dense layers, where each Dense layer is followed by an Activation layer.
  • Note that I used a Dropout layer only after the first two Activation layers.
  • The first two Activation layers have ‘tanh’ as the activation function.
  • For the last Activation layer, I used ‘softmax’ because it is a binary classification problem.
  • I used Stochastic Gradient Descent with Nesterov momentum for training.
  • Typically, ‘binary_crossentropy’ is used for binary classification problems. This is similar to ‘logloss’.
# Initialize the MLP
def initialize_nn(frame_size):
    model = Sequential()  # The Keras Sequential model is a linear stack of layers
    model.add(Dense(100, kernel_initializer='uniform', input_dim=frame_size))  # Dense layer
    model.add(Activation('tanh'))  # Activation layer
    model.add(Dropout(0.5))  # Dropout layer
    model.add(Dense(100, kernel_initializer='uniform'))  # Another dense layer
    model.add(Activation('tanh'))  # Another activation layer
    model.add(Dropout(0.5))  # Another dropout layer
    model.add(Dense(2, kernel_initializer='uniform'))  # Last dense layer
    model.add(Activation('softmax'))  # Softmax activation at the end
    sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)  # SGD with Nesterov momentum
    model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])  # Using logloss
    return model
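To make the "similar to logloss" remark concrete, here is a small hand computation of binary cross-entropy with NumPy. This is only a sketch; the label and probability arrays are made up for illustration:

```python
import numpy as np

# Binary cross-entropy (logloss) averaged over samples:
#   -mean(t * log(p) + (1 - t) * log(1 - p))
def binary_crossentropy(t, p, eps=1e-7):
    p = np.clip(p, eps, 1 - eps)  # avoid log(0)
    return -np.mean(t * np.log(p) + (1 - t) * np.log(1 - p))

t = np.array([1.0, 0.0, 1.0])   # true labels (hypothetical)
p = np.array([0.9, 0.2, 0.8])   # predicted probabilities (hypothetical)
print(round(binary_crossentropy(t, p), 4))  # → 0.1839
```

Confident predictions on the correct class (0.9 for a 1, 0.2 for a 0) contribute small terms; a wrong, confident prediction would blow the loss up, which is exactly what the training objective penalizes.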
  • Generate train.csv and target.csv and place them in the same folder so that the function below can read them.
  • train.csv should have ‘n_frames’ rows and ‘frame_size’ columns.
  • target.csv should have ‘n_frames’ rows and 2 columns (like [1 0; 0 1; 1 0; …]), because it is a binary classification problem (1 — true and 0 — false).
# Get data
def get_data():
    X = pd.read_csv("train.csv").values  # convert to NumPy arrays for Keras
    y = pd.read_csv("target.csv").values
    n_frames, frame_size = X.shape
    return X, y, n_frames, frame_size
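If you just want to try the pipeline end to end, the two CSV files can be generated synthetically. The sizes and the labeling rule below are made up purely for illustration, not the author's data:

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(42)
n_frames, frame_size = 200, 10           # hypothetical sizes
X = rng.randn(n_frames, frame_size)
labels = (X[:, 0] > 0).astype(int)       # toy rule: sign of the first feature
y = np.eye(2)[labels]                    # one-hot rows like [1 0] or [0 1]
pd.DataFrame(X).to_csv("train.csv", index=False)
pd.DataFrame(y).to_csv("target.csv", index=False)
```

Because the label depends only on one feature, even a small MLP should reach high accuracy quickly on this toy data.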
  • Next is the plotting function.
  • First we generate false positive and true positive rates using ‘roc_curve’.
  • We then compute the area under the curve.
  • We then plot the final ROC curve using the pyplot interface of matplotlib.
# Plot data
def generate_results(y_test, y_score):
    fpr, tpr, _ = roc_curve(y_test, y_score)
    roc_auc = auc(fpr, tpr)
    print('AUC: %f' % roc_auc)  # print before plt.show(), which blocks
    plt.figure()
    plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
    plt.plot([0, 1], [0, 1], 'k--')
    plt.xlim([0.0, 1.05])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver operating characteristic curve')
    plt.legend(loc='lower right')
    plt.show()
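The ‘roc_curve’/‘auc’ pair can be sanity-checked on toy scores without any trained model; the values below are the small example from scikit-learn's own documentation:

```python
from sklearn.metrics import roc_curve, auc

y_true = [0, 0, 1, 1]                 # toy ground-truth labels
y_score = [0.1, 0.4, 0.35, 0.8]       # toy classifier scores
fpr, tpr, _ = roc_curve(y_true, y_score)
print(auc(fpr, tpr))  # → 0.75
```

An AUC of 0.5 would mean random guessing and 1.0 a perfect ranking, so 0.75 reflects the one mis-ranked pair (0.4 vs. 0.35) in this toy data.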
  • Below we call all the functions.
  • The first thing you will see is a call to ‘get_data()’.
  • This is the function that returns X, y, n_frames, and frame_size — all the training data that will be used in the next steps.
  • I then used a function from sklearn that nicely splits the data into a training dataset and a testing dataset.
  • After data splitting, train the model by first initializing the MLP using ‘initialize_nn’.
  • I used 10 epochs but you can change this number depending on your need.
  • Once the model is trained, predictions are made on the test data, followed by some plotting.
  • I have used sklearn modules such as ‘roc_curve’ and ‘auc’ to generate some plots/results.
# Calling all modules
print('Loading and reading data')
X, y, n_frames, frame_size = get_data()
print('Splitting data into training and testing')
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=10)
print('Initializing model')
model = initialize_nn(frame_size)
print('Training model')
model.fit(X_train, y_train,
          batch_size=32, epochs=10,
          verbose=1, shuffle=True)
print('Predicting on test data')
y_score = model.predict(X_test)
print('Generating results')
generate_results(y_test[:, 0], y_score[:, 0])
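Since model.predict returns two softmax columns per row, hard class labels can be recovered with argmax when you need them rather than the raw scores. A minimal sketch with made-up prediction values:

```python
import numpy as np

y_score = np.array([[0.9, 0.1],     # hypothetical model.predict output
                    [0.3, 0.7]])
y_pred = y_score.argmax(axis=1)     # index of the larger probability per row
print(y_pred.tolist())  # → [0, 1]
```

Note that the ROC curve above is plotted from the first score column directly, since it needs continuous scores rather than hard labels.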

Source: Deep Learning on Medium