Original article was published by Aakash kaushik on Artificial Intelligence on Medium
Convolutional Neural Network (CNN) in C++
There are a lot of Convolutional Neural Network articles out there explaining what a CNN is and what its uses are. This article doesn't focus on that. Today we are going to code up a CNN in C++ with a library called mlpack to classify the MNIST dataset.
You may ask: why C++, when it's easy in Python with a plethora of libraries? You have probably seen some Tesla cars by now. Those kinds of systems require real-time inference from their environment, and Python is great for prototyping but struggles to deliver real-time results when such huge models are deployed with it.
mlpack is a machine learning library written in C++ that makes use of some other underlying libraries to provide fast and extensible cutting-edge machine learning and deep learning methods. You can learn more at https://www.mlpack.org/
The data we are going to use is contained in a CSV file and consists of images of the digits 0 to 9, where one column holds the labels and each row holds the features of one image. When we load the data into our matrix it gets transposed, and the header line naming each feature is loaded along with it, so we need to take care of both.
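To make the transposition concrete, here is a minimal sketch in plain C++ with no mlpack involved. The `Matrix` alias and the `transpose` function are hypothetical names used only for illustration, and the sketch assumes the label is the first field of each CSV row. It only mimics the layout change: one image per row in the file, one image per column in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a real matrix type: a vector of rows.
using Matrix = std::vector<std::vector<double>>;

// Turn a table with one sample per row (the CSV layout) into a table
// with one sample per column (the layout after loading).
Matrix transpose(const Matrix& rows) {
  assert(!rows.empty());
  const std::size_t width = rows.front().size();
  Matrix transposed(width, std::vector<double>(rows.size()));
  for (std::size_t r = 0; r < rows.size(); ++r) {
    assert(rows[r].size() == width);  // every CSV row must have the same length
    for (std::size_t c = 0; c < width; ++c) {
      transposed[c][r] = rows[r][c];
    }
  }
  return transposed;
}
```

After the call, row 0 of the result holds every label and each remaining row holds one pixel position across all images.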
First we will start by including some libraries and functionalities that we need to define our network.
Then we are going to declare a helper function that converts the model output into a row matrix, to match our loaded labels, which are also stored as a row matrix.
From here on, all the code lives inside the main function; the function wrapper itself is left out so the code is easier to explain in parts.
Now we will declare some of the obvious training parameters that we will need, and I will explain the ones which stand out.
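The idea behind such a helper can be sketched without mlpack: for every sample, pick the class with the highest score. The version below is an illustration, not the article's original helper. The name `getLabels` and the vector-of-vectors input are assumptions, and it returns labels that start from 1 on the assumption that the loaded labels use the same convention.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Each inner vector holds the scores the network produced for one sample,
// one score per class. Returns, per sample, the 1-based index of the
// highest score.
std::vector<std::size_t> getLabels(
    const std::vector<std::vector<double>>& scores) {
  std::vector<std::size_t> labels;
  labels.reserve(scores.size());
  for (const auto& sample : scores) {
    assert(!sample.empty());
    const auto best = std::max_element(sample.begin(), sample.end());
    const auto index =
        static_cast<std::size_t>(std::distance(sample.begin(), best));
    labels.push_back(index + 1);  // shift from 0-based to 1-based
  }
  return labels;
}
```

With log-probabilities as scores, the largest (least negative) value wins, so no extra conversion is needed before taking the maximum.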
The parameter MAX_ITERATIONS is set to 0 because this puts no limit on the number of iterations, which lets early stopping decide when training ends later in the training phase. As a side note, early stopping can also be used when this parameter is not set to 0.
Let's process the data: remove the column which describes what is contained in each of the rows, as I described in the data part, and make separate matrices for the labels and features of the training, validation and testing sets.
We are going to use the Negative Log Likelihood loss. In the mlpack version used here, the labels for it start from 1 instead of 0, so we add 1 to the labels.
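Here is a rough sketch of that preprocessing in plain C++, not the mlpack code itself. It assumes the loaded matrix has the header in column 0 and the labels in row 0; `Matrix`, `LabeledData` and `splitLabels` are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a real matrix type: a vector of rows.
using Matrix = std::vector<std::vector<double>>;

struct LabeledData {
  Matrix features;             // one row per pixel, one column per image
  std::vector<double> labels;  // one entry per image, shifted to start at 1
};

// Assumes column 0 is the loaded header and row 0 holds the labels.
LabeledData splitLabels(const Matrix& loaded) {
  assert(loaded.size() >= 2 && loaded.front().size() >= 2);
  LabeledData out;
  // Skip column 0 (the header) and shift the labels from 0..9 to 1..10.
  for (std::size_t c = 1; c < loaded.front().size(); ++c) {
    out.labels.push_back(loaded[0][c] + 1.0);
  }
  // Every remaining row is a feature row; drop its header entry too.
  for (std::size_t r = 1; r < loaded.size(); ++r) {
    assert(loaded[r].size() == loaded.front().size());
    out.features.emplace_back(loaded[r].begin() + 1, loaded[r].end());
  }
  return out;
}
```

The same split would be applied to the training, validation and testing sets so that all three use the same label convention.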
Now let’s take a look at the simple Convolutional Architecture that we are going to define.
I will present the other details through the code.
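One detail worth checking by hand when stacking convolution and pooling layers is how the image shrinks from layer to layer. The sketch below uses the standard output-size formula, (input + 2 * padding - kernel) / stride + 1. The function name `outputSize` is hypothetical, and the layer sizes in the usage note are assumed for illustration only, not taken from the architecture above.

```cpp
#include <cassert>
#include <cstddef>

// Output width (or height) of a convolution or pooling layer applied
// to an input of the given width (or height).
std::size_t outputSize(std::size_t input, std::size_t kernel,
                       std::size_t stride = 1, std::size_t padding = 0) {
  assert(stride > 0);
  assert(input + 2 * padding >= kernel);
  return (input + 2 * padding - kernel) / stride + 1;
}
```

For example, assuming a 28x28 MNIST image, a 5x5 convolution with stride 1 and no padding gives `outputSize(28, 5)`, which is 24, and a 2x2 max pooling with stride 2 after it gives `outputSize(24, 2, 2)`, which is 12. The number of inputs to the final linear layer has to match whatever size comes out of the last pooling step.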
Let's train... Nope, it won't take that long 🙂
As you can see, we use EarlyStopAtMinLoss on the validation set. That's why the parameter MAX_ITERATIONS was set to 0: it lets us run an unlimited number of iterations and leaves the stopping decision to the callback.
The time we all have been waiting for: prediction time!
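If early stopping is new to you, the logic is small enough to sketch in plain C++. This is not how mlpack implements EarlyStopAtMinLoss; the `EarlyStopper` class and its `patience` parameter are hypothetical, and the sketch only shows the general idea of stopping once the validation loss has not improved for a few epochs in a row.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Tracks the best validation loss seen so far and signals a stop after
// `patience` consecutive epochs without improvement.
class EarlyStopper {
 public:
  explicit EarlyStopper(std::size_t patience) : patience_(patience) {
    assert(patience > 0);
  }

  // Call once per epoch; returns true when training should stop.
  bool ShouldStop(double validationLoss) {
    if (validationLoss < bestLoss_) {
      bestLoss_ = validationLoss;
      epochsSinceBest_ = 0;
      return false;
    }
    ++epochsSinceBest_;
    return epochsSinceBest_ >= patience_;
  }

  double BestLoss() const { return bestLoss_; }

 private:
  std::size_t patience_;
  std::size_t epochsSinceBest_ = 0;
  double bestLoss_ = std::numeric_limits<double>::infinity();
};
```

A training loop with no iteration limit would call `ShouldStop` after each epoch and break out as soon as it returns true.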
But first let’s see how.
This is going to print the training, validation and test set accuracy by comparing our labels with the predicted labels.
Your friend might want to try your model
That's how you save it. By the way, be ready for some criticism from your friend 😉
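The comparison itself is just a count of matching labels. Here is a minimal sketch in plain C++; the function name `accuracy` is hypothetical, and it assumes both label lists use the same convention (both starting from 1, or both from 0).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Percentage of samples whose predicted label equals the true label.
double accuracy(const std::vector<std::size_t>& predicted,
                const std::vector<std::size_t>& actual) {
  assert(!actual.empty());
  assert(predicted.size() == actual.size());
  std::size_t correct = 0;
  for (std::size_t i = 0; i < actual.size(); ++i) {
    if (predicted[i] == actual[i]) {
      ++correct;
    }
  }
  return 100.0 * static_cast<double>(correct) /
         static_cast<double>(actual.size());
}
```

Calling it once per split gives the three numbers mentioned above: one each for the training, validation and test sets.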
And this is how your friend loads the model.
Now that's where the main function ends, in case you forgot it was C++.
The purpose behind writing this article was to increase the ever-growing love and interest for C and C++, and to make people think about how libraries such as PyTorch or TensorFlow work beneath that beautiful Python code.
- More C++ model examples.
- Learning C++: I personally advise going with the book Accelerated C++ for starters, and also check out this book guide on Stack Overflow.