Original article was published by Ganesh Ravichandran on Deep Learning on Medium
I have delved into learning about neural networks at least a dozen times, and learned a bit more each time. This time I want to start off with the simplest datasets and use them to understand network behavior. This article is the first in a series on my adventures with neural networks.
In this post, we are going to focus on multiclass classification. This is the problem of identifying which class a particular data point belongs to, out of a fixed set of classes.
The libraries we will use are:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
We will start off by making two clusters of data:
X, y = make_blobs(n_samples=1000, centers=2, n_features=2, random_state=0, cluster_std=0.9)
y = tf.keras.utils.to_categorical(y)
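To make the effect of to_categorical concrete, here is a minimal NumPy sketch of one-hot encoding. The `labels` array is made up for illustration, and the `np.eye` indexing trick is a stand-in for what the Keras utility does, not its actual implementation.

```python
import numpy as np

# Hypothetical integer class labels for a two-class problem
labels = np.array([0, 1, 1, 0])
num_classes = 2

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix with the labels encodes them all at once
one_hot = np.eye(num_classes)[labels]
print(one_hot)

# argmax along each row recovers the original integer labels
recovered = one_hot.argmax(axis=1)
print(recovered)
```

Each row has a single 1 in the column of its class, which is the target format that a softmax output layer and a categorical cross-entropy loss expect.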
Let’s split the data into training and test sets:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
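As a quick sanity check on what the split produces, the sketch below runs train_test_split with the same test_size and random_state on placeholder arrays. `X_demo` and `y_demo` are hypothetical stand-ins with the same shape as the blob data, not the article's actual data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 1000 points with 2 features, every row unique
X_demo = np.arange(2000, dtype=float).reshape(1000, 2)
y_demo = np.arange(1000) % 2

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=42
)

# Roughly two thirds of the rows land in the training set
print(X_tr.shape, X_te.shape)

# No row appears in both sets
overlap = set(X_tr[:, 0]) & set(X_te[:, 0])
print(len(overlap))
```

Fixing random_state makes the shuffle reproducible, so the same rows end up in each set on every run.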
Now for the actual model. We are going to build a multi-layer perceptron with four Dense layers. The last layer uses a softmax activation function, which turns the network's raw outputs into a probability distribution over the classes; the class with the highest probability is the prediction:
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(16, input_shape=(2,), activation="relu"))
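The softmax step mentioned above can be sketched in plain NumPy. This is an illustrative version rather than the Keras implementation, and the `logits` values are invented for the example.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    # Subtracting the row maximum avoids overflow in exp without changing the result
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

# Hypothetical raw outputs of the last layer for two data points
logits = np.array([[2.0, 0.5],
                   [-1.0, 3.0]])

probs = softmax(logits)
print(probs)               # each row is a probability distribution
print(probs.argmax(axis=1))  # index of the most likely class per row
```

Larger raw scores get larger probabilities, and the ordering of the scores is preserved, so taking the argmax of the probabilities gives the same class as taking the argmax of the raw outputs.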
Next is the model compilation step. We will use the Adam optimizer and categorical_crossentropy as the loss for our multi-class classification problem, and we will track accuracy as a metric.
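To see what the categorical cross-entropy loss measures, here is a small NumPy sketch. It follows the textbook formula, with a clipping constant `eps` chosen for illustration, and the targets and predictions are made up; the Keras version may differ in details.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot targets and predicted probabilities."""
    # Clip to keep log() away from zero
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# Hypothetical one-hot targets and softmax outputs for two data points
y_true = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
y_pred = np.array([[0.9, 0.1],   # confident and correct
                   [0.4, 0.6]])  # correct, but less confident

losses = categorical_crossentropy(y_true, y_pred)
mean_loss = losses.mean()
print(losses, mean_loss)
```

Because the targets are one-hot, the loss for each point reduces to the negative log of the probability assigned to the true class, so the less confident prediction is penalized more.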
Let’s run 10 epochs:
history = model.fit(
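Since the model, compile, and fit listings appear only in part here, the following is a minimal end-to-end sketch of how these steps could fit together. Only the first layer (16 units, ReLU), the softmax output, the Adam optimizer, the loss, the accuracy metric, and the 10 epochs come from the text. The widths of the two middle layers, the use of `tf.keras.Input`, and passing the test set as `validation_data` are assumptions made for illustration.

```python
import tensorflow as tf
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

# Same data preparation as earlier in the article
X, y = make_blobs(n_samples=1000, centers=2, n_features=2,
                  random_state=0, cluster_std=0.9)
y = tf.keras.utils.to_categorical(y)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42)

# Four Dense layers; the widths of the two middle layers are assumptions
model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),                       # two input features
    tf.keras.layers.Dense(16, activation="relu"),     # from the article
    tf.keras.layers.Dense(16, activation="relu"),     # assumed width
    tf.keras.layers.Dense(16, activation="relu"),     # assumed width
    tf.keras.layers.Dense(2, activation="softmax"),   # one unit per class
])

# Adam optimizer, categorical cross-entropy loss, accuracy as the metric
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Train for 10 epochs; validation_data is an assumption, not from the article
history = model.fit(X_train, y_train,
                    validation_data=(X_test, y_test),
                    epochs=10, verbose=0)

# history.history holds one value per epoch for each tracked quantity
print(sorted(history.history.keys()))
```

The accuracy numbers from a run of this sketch will not necessarily match the ones reported in the article, since the layer widths and the random weight initialization may differ.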
Finally, we can take a look at the accuracy of the model on the training and testing data:
_, train_accuracy = model.evaluate(X_train, y_train, verbose=0)
_, test_accuracy = model.evaluate(X_test, y_test, verbose=0)
print("Train: %.3f, Test: %.3f" % (train_accuracy, test_accuracy))
>> Train: 0.957, Test: 0.948
Our model reaches 95.7% accuracy on the training data and 94.8% on the test data. A slightly higher training score is expected, since the model was fit to the training data. The test accuracy, measured on data the model has not seen, is the better measure of how well it predicts.
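For intuition about what the accuracy metric counts, here is a small NumPy sketch that computes it from one-hot targets and predicted probabilities. The arrays are invented for the example and the `accuracy` helper is hypothetical, not a Keras function.

```python
import numpy as np

def accuracy(y_true_one_hot, y_prob):
    """Fraction of rows where the most probable class matches the true class."""
    true_classes = np.argmax(y_true_one_hot, axis=1)
    predicted_classes = np.argmax(y_prob, axis=1)
    return float(np.mean(true_classes == predicted_classes))

# Hypothetical targets and model outputs for four data points
y_true = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])
y_prob = np.array([[0.8, 0.2],
                   [0.3, 0.7],
                   [0.6, 0.4],   # wrong: true class is 1, predicted class is 0
                   [0.9, 0.1]])

acc = accuracy(y_true, y_prob)
print(acc)  # 3 of 4 predictions are correct
```

Accuracy only looks at which class has the highest probability, so it does not distinguish a confident correct prediction from a barely correct one; the loss captures that difference.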
What happens if we change the hyperparameters of the model? Hyperparameters include things like the number of epochs, number and types of hidden layers, and the activation functions. I will explore all of these in future posts.