Convolutional Neural Network: Hyper-Parameter Tuning with Hyperas (HackerEarth)


After tuning XGBoost hyper-parameters, it is now the turn of CNNs and neural networks in general.

While sharing the source code on its own might already be useful, I will instead walk through the code I wrote part by part, so that you can tweak the options and debug it yourself.

Here is the full source code:

Introduction to the study case

The goal of this CNN model is to classify cropped pictures into four categories:

['Attire','Decorationandsignage','Food','misc']

The dataset comes from the Gala Detection challenge on HackerEarth. In our notebook, we also perform a simple exploratory analysis of the dataset. (If you are interested, write to me and I will send it to you.)

The full code for this Hyperas approach consists of two parts: Data Preparation and the CNN Model.

Data Preparation (all in a function that we will call later on)

In this part, all the data loading and preprocessing goes into a single function whose output can be fed directly to the neural network (a sketch of such a function follows the notes below).

Things to notice:

  • Global variables also need to be defined inside the data() function. If you work in a notebook, you might expect previously assigned variables to be in scope, but that is not the case.
  • Fortunately, the data() function is only called once when passed to the optimizing function.
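For reference, here is a minimal sketch of what such a data() function could look like. The file paths, CSV column names, image size, and the 80/20 split are assumptions made for illustration; adapt them to wherever you stored the challenge data.

```python
import numpy as np

def data():
    """Load and preprocess the images so the output can be fed straight
    to the network. Everything, including 'global' constants, lives inside
    this function because Hyperas re-executes it in its own scope."""
    import os
    import cv2
    import pandas as pd
    from keras.utils import to_categorical
    from sklearn.model_selection import train_test_split

    # Hypothetical paths and constants -- adapt to your local setup.
    IMG_DIR = 'data/Train_Images'
    TRAIN_CSV = 'data/train.csv'          # assumed columns: 'Image', 'Class'
    IMG_SIZE = 48
    CLASSES = ['Attire', 'Decorationandsignage', 'Food', 'misc']

    df = pd.read_csv(TRAIN_CSV)
    images, labels = [], []
    for _, row in df.iterrows():
        # Read each picture in grayscale and resize it to 48x48.
        img = cv2.imread(os.path.join(IMG_DIR, row['Image']), cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))
        images.append(img)
        labels.append(CLASSES.index(row['Class']))

    # Normalize pixel values and one-hot encode the labels.
    x = np.array(images, dtype='float32').reshape(-1, IMG_SIZE, IMG_SIZE, 1) / 255.0
    y = to_categorical(labels, num_classes=len(CLASSES))

    x_train, x_test, y_train, y_test = train_test_split(
        x, y, test_size=0.2, random_state=42)
    return x_train, y_train, x_test, y_test
```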

CNN Model: Specific to Hyperas

Here is one version of the CNN model. A quick description of the architecture:

  • The input layer has shape (48,48,1)
  • 4 to 5 convolution layers (with 3×3 and 5×5 filters), each coupled with batch normalization
  • 2 MaxPools (2D) and 2 Dense Layers
  • We use only ReLU activation on the hidden layers and softmax for the output layer
  • We also try different optimizers such as Adam, SGD, etc.

With these settings, the hyper-parameters we tune are the following:

  • From Dropout(0.25) we change it to Dropout({{uniform(0, 1)}}): this lets Hyperas search the interval [0, 1] for the value that minimizes the training loss, sampling randomly rather than trying every single value between 0 and 1.
  • From Dense(256) to Dense({{choice(...)}}): this lets us pick, again with respect to the training loss, the best size for this fully connected layer.
  • Putting optimizer={{choice(...)}} in the optimizer field gives us the best optimizer among the listed choices.
  • batch_size={{choice(...)}} works the same way as the previous ones.

We decided to optimize these parameters, but you could of course tune other hyper-parameters as well. A sketch of such a model is shown below.
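Here is a minimal sketch of such a model written with Hyperas placeholders. The exact number of filters, the candidate values inside choice(...), and the number of epochs are assumptions made for illustration; the double-brace {{...}} syntax is what Hyperas rewrites with a sampled value before each trial is executed.

```python
from hyperopt import STATUS_OK
from hyperas.distributions import choice, uniform
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Flatten, Dense, Dropout

def create_model(x_train, y_train, x_test, y_test):
    """CNN whose dropout rates, dense size, optimizer, and batch size
    are left to Hyperas via {{...}} placeholders."""
    model = Sequential()
    # First convolution block: 3x3 and 5x5 filters, each followed by batch norm.
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(48, 48, 1)))
    model.add(BatchNormalization())
    model.add(Conv2D(32, (5, 5), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout({{uniform(0, 1)}}))          # instead of a fixed Dropout(0.25)

    # Second convolution block.
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(BatchNormalization())
    model.add(Conv2D(64, (5, 5), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout({{uniform(0, 1)}}))

    model.add(Flatten())
    model.add(Dense({{choice([128, 256, 512])}}, activation='relu'))   # instead of Dense(256)
    model.add(Dropout({{uniform(0, 1)}}))
    model.add(Dense(4, activation='softmax'))      # 4 output categories

    model.compile(loss='categorical_crossentropy',
                  optimizer={{choice(['adam', 'sgd', 'rmsprop'])}},
                  metrics=['accuracy'])

    model.fit(x_train, y_train,
              batch_size={{choice([32, 64, 128])}},
              epochs=10,
              verbose=2,
              validation_data=(x_test, y_test))

    # Use validation accuracy as the objective (return the loss instead if you prefer).
    score, acc = model.evaluate(x_test, y_test, verbose=0)
    return {'loss': -acc, 'status': STATUS_OK, 'model': model}
```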

Optimization and Fitting

With the model defined, let’s optimize all of the hyper-parameters listed above.

The function optim.minimize() takes the model function and the data function as arguments. To get the best hyper-parameter values, you simply have to print(best_run).
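As a sketch, the call could look like the following; max_evals and the notebook name are placeholders you should adapt to your own setup.

```python
from hyperopt import Trials, tpe
from hyperas import optim

best_run, best_model = optim.minimize(
    model=create_model,
    data=data,
    algo=tpe.suggest,                 # TPE search over the {{...}} spaces
    max_evals=20,                     # number of configurations to try
    trials=Trials(),
    notebook_name='gala_detection')   # only needed in a notebook; hypothetical name

print(best_run)                       # dictionary of the best hyper-parameter values

# Evaluate the best model on the held-out split returned by data().
x_train, y_train, x_test, y_test = data()
print(best_model.evaluate(x_test, y_test))
```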

Possible bugs during implementation depending on your settings

Here are some possible bugs that you might encounter and how to deal with them.

  • TypeError: requires string label | You need to wrap your code in a def create_model(...): ... function and then pass it to optim.minimize(model=create_model, ...), as in the example.
  • TypeError: ‘generator’ object is not subscriptable | just run pip install networkx==1.11
  • “No such file or directory” or OSError, Err22 | pass notebook_name='{your_notebook_name}' to optim.minimize(), just as we did in our code.

If you want to learn more about Hyperas, I highly recommend checking out its original source code.