Original article was published on Deep Learning on Medium
The Convolution Parameters Calculation
Today we are going to talk about how the parameters of a convolution layer are counted. Before we dive into that, let’s see what a convolutional neural network is.
What is Convolutional Neural Network?
In deep learning, a convolutional neural network is a class of deep neural networks, most commonly applied to analysing visual imagery. It is widely used because of its shared-weight architecture and translation-invariance characteristics.
Now let’s get back to our topic. In the images below you can see a small convnet instantiated, and we have also loaded its summary.
In the picture above we can see the model summary, which is printed by calling the summary method of the Model class in TensorFlow. Now the question is, what is TensorFlow?
TensorFlow is a free and open-source software library for data flow and differentiable programming across a range of tasks. It is a symbolic math library, and is also used for machine learning applications such as neural networks.
That’s about it for TensorFlow.
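Now let’s get to the computation of parameters in a convolutional neural network model. While doing so, keep the following formula in mind. Here input channels is the depth of the layer’s input, and bias is 1 because each filter has one bias value.
((kernel width * kernel height) * input channels + bias) * no. of filters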
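The formula above can be sketched as a small Python function. This is only an illustration: the name conv2d_params and its arguments are made up for this post and are not part of TensorFlow. The argument in_channels is the depth of the layer's input, which is the quantity multiplied by the kernel area.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Count the trainable parameters of a 2D convolution layer.

    Hypothetical helper for illustration, not a TensorFlow API.
    """
    # Every filter holds one weight per kernel position per input channel.
    weights_per_filter = kernel_h * kernel_w * in_channels
    # Every filter also has a single bias value (if biases are used).
    bias_per_filter = 1 if use_bias else 0
    return (weights_per_filter + bias_per_filter) * filters


# First layer of the model in this post: 3x3 kernel, 1 input channel, 32 filters.
print(conv2d_params(3, 3, 1, 32))  # 320
```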
What is Stride?
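Stride is a parameter of the convolution filter that sets how far the filter moves across the image at each step. It changes the size of the layer’s output, not the number of weights, so it does not affect the parameter count. The quantity that multiplies the kernel area in the calculation is the number of input channels, that is, the depth of the layer’s input.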
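To see what the stride actually changes, here is a minimal sketch that compares the output width and the weight count for two stride values. The 28-pixel input width is an assumption made only for this example, and both helper functions are hypothetical names, not TensorFlow calls.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Width (or height) of a convolution output along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def conv_weight_count(kernel_size, in_channels, filters):
    """Trainable parameters of a square-kernel convolution with biases."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


# Assumed 28-pixel-wide, single-channel input; 3x3 kernel; 32 filters.
for stride in (1, 2):
    print(stride, conv_output_size(28, 3, stride), conv_weight_count(3, 1, 32))
# 1 26 320
# 2 13 320
```

The output shrinks as the stride grows, while the number of parameters stays at 320.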
Now let’s start with the first convolution layer. The summary shows it has a total of 320 parameters. We will compute the parameters using the arguments shown in our convolutional neural network model.
I’ve chosen a simple CNN model for better understanding. OK, now it is time to dive into the computation.
Note: A max pooling layer only downsamples the feature maps (here it halves their width and height). It has no trainable weights, so it adds nothing to the parameter count.
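The note above can be sketched in a few lines. The function name and the 26-pixel input width are assumptions for illustration; the pooling window of 2 matches the halving described in the note.

```python
def max_pool_output_size(input_size, pool_size=2, stride=None):
    """Width (or height) after max pooling without padding."""
    if stride is None:
        stride = pool_size  # a common default: the window does not overlap
    return (input_size - pool_size) // stride + 1


# Max pooling only picks the largest value in each window,
# so there is nothing to learn and nothing to count.
MAX_POOL_PARAMS = 0

print(max_pool_output_size(26))  # 13
print(MAX_POOL_PARAMS)           # 0
```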
We have 32 filters, a convolution patch or kernel of (3,3), 1 input channel (the 320 parameters in the summary imply a single-channel input), and a bias value of 1 per filter.
((3 * 3) * 1 + 1) * 32 = 320, which is the total no. of parameters shown in the model summary for the first layer.
Now let’s proceed to the second convolution layer. As shown in the model summary, its total number of parameters is 18,496. Again we will apply our formula, but before we do, let’s keep one very important thing in mind.
The filters from the previous layer become part of the calculation in the subsequent layer.
Why? Because each filter in this layer looks at all the feature maps produced by the previous layer, and those patterns are what it uses to learn representations.
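One way to picture this is through the shape of the weight tensor. In Keras a Conv2D kernel is stored as (height, width, input channels, filters), so the 32 filters of the first layer show up as the third dimension of the second layer's kernel. The sketch below only multiplies the shapes out and does not need TensorFlow.

```python
import math

# Second convolution layer: 3x3 kernel, 32 input channels, 64 filters.
kernel_shape = (3, 3, 32, 64)  # (height, width, input channels, filters)
bias_shape = (64,)             # one bias per filter

weights = math.prod(kernel_shape)  # 18432
biases = math.prod(bias_shape)     # 64

print(weights + biases)  # 18496
```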
For the second convolution layer, using the formula mentioned above, we have a kernel width of 3 and a kernel height of 3, same as before, so we have a 3×3 patch. The number of input channels is 32, because the first convolution layer has 32 filters and so outputs 32 feature maps (max pooling does not change that number), and each filter in this layer needs a 3×3 set of weights for every one of them. The bias is the same as before, which is 1 per filter. Now putting the values in our formula:
((3 * 3) * 32 + 1) * 64, which gives us the total number of parameters.
Let’s do some mathematics:
(3 * 3) * 32 = 288; after adding the bias of 1 it becomes 289; finally we multiply 289 by the no. of filters, 64, and get 18,496.
Easy, right? Now for the third convolution layer, again skipping the max pooling layer, we follow the same method as for the second layer. This time the number of input channels is 64, because the second convolution layer has 64 filters, while every other value in our calculation is the same as before. So without further ado, we apply our formula.
((3 * 3) * 64 + 1) * 64 = 36,928
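The three calculations can be chained in a short loop, where the number of filters in one layer becomes the number of input channels of the next. This is a sketch under the assumptions used in this post: a single-channel input, 3×3 kernels everywhere, and 32, 64 and 64 filters.

```python
filters_per_layer = [32, 64, 64]  # the three convolution layers
in_channels = 1                   # assumed single-channel input

per_layer = []
for filters in filters_per_layer:
    # (kernel width * kernel height * input channels + bias) * no. of filters
    params = (3 * 3 * in_channels + 1) * filters
    per_layer.append(params)
    # The feature maps of this layer are the input channels of the next one.
    in_channels = filters

print(per_layer)       # [320, 18496, 36928]
print(sum(per_layer))  # 55744
```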
I hope this blog helped you learn how to calculate the parameters of a Convolutional Neural Network.