Build your own deep learning classification model in Keras



Step #6: Create our model

In this task we will build a classification convolutional neural network from scratch and train it to recognize the 20 target classes in the Pascal VOC dataset.

Our model architecture will be based on the popular VGG-16 architecture: a CNN with a total of 13 convolutional layers (cf. Figure 1).

We opt for the Sequential API to build the model. First we import everything we need (assuming the Keras bundled with TensorFlow 2.x) and instantiate an empty model.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense

model = Sequential()

We add two convolutional layers. In a convolutional layer, multiple filters slide over the image, each extracting a different feature.

Arguments given:

– input_shape: the shape of the input image, here (224, 224, 3).

– filters: the number of filters the convolutional layer will learn.

– kernel_size: the width and height of the 2D convolution window.

– padding: specifying “same” ensures that the spatial dimensions are unchanged after the convolution.

– activation: a convenience argument that specifies the activation function applied after the convolution. We will apply the ReLU activation function; more on this later.

model.add(Conv2D(input_shape=(224,224,3), filters=64, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=64, kernel_size=(3,3), padding="same", activation="relu"))
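
As a quick check of the padding behaviour (a minimal sketch of mine, assuming TensorFlow 2.x), we can run a single “same”-padded convolution on a standalone input and confirm that the 224×224 spatial dimensions are preserved:

from tensorflow.keras.layers import Conv2D, Input

# With padding="same", a 3x3 convolution leaves the spatial dimensions
# untouched; only the channel dimension changes (3 -> 64 filters).
inputs = Input(shape=(224, 224, 3))
outputs = Conv2D(filters=64, kernel_size=(3, 3), padding="same", activation="relu")(inputs)
print(outputs.shape)  # (None, 224, 224, 64)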

Next, we add a max pooling layer.

Pooling reduces the dimensionality of the data by shrinking the spatial size of the output of the previous convolutional layer.

– pool_size=(2,2): the size of the window that slides over the output; the maximum value within each window is kept.

– strides=(2,2): how far the window moves along the x and y axes at each step (a small numeric example follows the code below).

model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
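
Here is the numeric example announced above (an illustrative sketch, assuming TensorFlow 2.x): a 2×2 window with stride 2 halves each spatial dimension, keeping only the maximum value per window.

import numpy as np
from tensorflow.keras.layers import MaxPool2D

# A 4x4 single-channel input is reduced to 2x2: each output value is the
# maximum of one non-overlapping 2x2 window.
x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 8, 1],
              [3, 4, 9, 5]], dtype="float32").reshape(1, 4, 4, 1)
pooled = MaxPool2D(pool_size=(2, 2), strides=(2, 2))(x)
print(pooled.numpy().reshape(2, 2))
# [[6. 4.]
#  [7. 9.]]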

We continue to add layers to our deep learning network. The same logic as described above is applied.

model.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2)))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2)))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2)))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2)))
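
At this point it is worth inspecting what we have built so far. Calling model.summary() (standard Keras) prints every layer with its output shape and parameter count, so you can confirm the 13 convolutional layers and the final 7×7×512 feature map.

model.summary()  # 13 Conv2D layers; last MaxPool2D outputs (None, 7, 7, 512)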

The convolutional base is now complete. To generate a prediction from it, we first have to flatten its output into a one-dimensional vector.

model.add(Flatten())

Next, we add the dense layers, which feed the flattened output of the convolutional base into their neurons.

Arguments:

– units: the number of neurons.

– activation: ReLU.

The ReLU activation function speeds up training since its gradient computation is very simple (either 0 or 1). It also means that negative values are not passed on, or “activated”, to the next layer. As a result, only a subset of the neurons is active at any time, which makes the network computationally efficient.
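
A quick numeric illustration (my addition, assuming TensorFlow 2.x): ReLU computes max(0, x), so negative inputs become 0 and positive inputs pass through unchanged.

import tensorflow as tf

# Negative values are zeroed out; positive values pass through as-is.
x = tf.constant([-2.0, -0.5, 0.0, 1.5, 3.0])
print(tf.keras.activations.relu(x).numpy())  # [0.  0.  0.  1.5 3. ]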

model.add(Dense(units=4096, activation="relu"))
model.add(Dense(units=4096, activation="relu"))

We finish with a sigmoid output layer, which turns each output of the previous layer into a probability between 0 and 1. The sigmoid is ideal for multi-label classification, which is why we use it instead of, for example, a softmax activation.

The probabilities produced by a sigmoid are independent and are not constrained to sum to one. This is crucial in a classification task with multiple output labels, where a single image can contain several of the target classes at once.
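
To make the difference concrete, here is a small comparison (an illustrative sketch, not from the original article) of the two activations applied to the same logits:

import tensorflow as tf

# Sigmoid gives independent per-class probabilities;
# softmax forces the outputs to sum to one.
logits = tf.constant([2.0, 1.0, 0.5])
sig = tf.keras.activations.sigmoid(logits).numpy()
soft = tf.keras.activations.softmax(tf.reshape(logits, (1, -1))).numpy()[0]
print(sig, sig.sum())    # ≈ [0.88 0.73 0.62], sum ≈ 2.23 (not constrained to 1)
print(soft, soft.sum())  # ≈ [0.63 0.23 0.14], sum = 1.0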

We set the units argument to 20 since we have 20 possible classes.

model.add(Dense(units=20, activation="sigmoid"))
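
As a final sanity check (a suggestion of mine, not part of the original article), we can push a random batch through the untrained model and confirm that the output has shape (batch_size, 20), one independent probability per class.

import numpy as np

# Random dummy image batch of the expected input shape.
dummy = np.random.rand(1, 224, 224, 3).astype("float32")
preds = model.predict(dummy)
print(preds.shape)  # (1, 20), each value an independent probability in [0, 1]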