The most important goal while designing is scalability: problems surface when we have to make our model ready for deployment. So, once we finish prototyping, we need to work on training with large datasets.
How to train on a large dataset?
Consider a simple use case where we have over 500 thousand matches each day, with a database holding over 100 days of data. That is 50 million matches to train and test on, and our local machines are incapable of processing all of it at once.
So the smart thing to do here is to train on one day's data, save the weights, and load them again when we wish to train on the next day's data. I have done this for 10 days of data to show how accuracy and loss changed over time.
To obtain the graphs above, the model was trained on Day 1 data and tested on Day 2 data for one epoch; in the next run, Day 2 data was used for training and Day 3 data for testing. This process continued until the last day. The results show how more data can improve the model. This is one way to deal with large datasets and make a model ready for production deployment.
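A minimal sketch of this day-by-day scheme, using a one-layer logistic-regression model as a stand-in for the real network (the checkpoint file name, training step, and toy data are all made up for illustration, not taken from the original experiment):

```python
import os
import numpy as np

rng = np.random.default_rng(0)
WEIGHTS_FILE = "daily_weights.npy"  # hypothetical checkpoint path

# start from scratch for this demo
if os.path.exists(WEIGHTS_FILE):
    os.remove(WEIGHTS_FILE)

def train_one_day(w, X, y, lr=0.1, epochs=1):
    """One epoch of logistic-regression gradient descent, standing in
    for training the real model on one day's matches."""
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))      # sigmoid predictions
        w -= lr * X.T @ (p - y) / len(y)      # gradient step
    return w

n_features = 5
for day in range(1, 11):                      # 10 days, as in the experiment
    X = rng.normal(size=(1000, n_features))   # stand-in for one day's matches
    y = (X[:, 0] > 0).astype(float)           # toy labels
    # resume from yesterday's weights if they exist, else start fresh
    w = np.load(WEIGHTS_FILE) if os.path.exists(WEIGHTS_FILE) else np.zeros(n_features)
    w = train_one_day(w, X, y)
    np.save(WEIGHTS_FILE, w)                  # checkpoint for the next day

print(w)  # weights carried across all 10 "days"
```

The same pattern works with any framework that can save and load weights between runs; only one day of data needs to fit in memory at a time.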
I will now list some practices that can help you design a good model.
Improving the Design of a Deep Learning Model
Get accuracy better than flipping a coin.
In a classification problem with only 2 classes, a model classifying objects with 50% accuracy is doing no better than chance. That shows the model is not learning and needs to be updated.
Try training on smaller datasets; this reduces training time and helps you reach a conclusion sooner.
Simplifying the network architecture also helps in understanding the model better.
Adding more layers increases the patterns your model can recognize, but it also slows the model down.
So if you are new to this field, like me, start with fewer layers and check whether adding more layers improves performance. Sometimes they are counterproductive, so design the model carefully.
In deep learning, weight initialization also plays an important role in model efficiency.
There are a few options available for weight initialization; test them with your model and see whether they affect it differently.
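To get intuition for why this matters, here is a sketch comparing the spread of tanh activations under three common initialization scales (all-zeros, unit normal, and Glorot/Xavier-style scaling); the layer sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(42)
fan_in, fan_out, batch = 256, 256, 1000
x = rng.normal(size=(batch, fan_in))

inits = {
    "zeros":  np.zeros((fan_in, fan_out)),
    "normal": rng.normal(0.0, 1.0, size=(fan_in, fan_out)),
    "glorot": rng.normal(0.0, np.sqrt(2.0 / (fan_in + fan_out)),
                         size=(fan_in, fan_out)),
}

# standard deviation of the layer's tanh activations per initializer
stds = {name: float(np.std(np.tanh(x @ W))) for name, W in inits.items()}
# zeros give dead (constant) activations, unit normals saturate tanh,
# Glorot scaling keeps activations in a healthier range
print(stds)
```

With zeros the layer outputs nothing useful, and with unit normals almost every unit saturates; the scaled initializer avoids both failure modes, which is why the choice shows up in training speed and accuracy.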
There are many options when selecting a cost function for a model, and the choice has a huge impact on the output, because the cost function is what drives the weight updates. Choose it wisely according to your design requirements.
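For a quick illustration, here is how two common cost functions score the same confident-but-wrong prediction; cross-entropy penalizes it far more sharply than mean squared error, which changes how strongly the weights get pushed (the numbers are illustrative only):

```python
import numpy as np

def mse(y, p):
    """Mean squared error."""
    return float(np.mean((y - p) ** 2))

def binary_cross_entropy(y, p, eps=1e-12):
    """Binary cross-entropy, clipped to avoid log(0)."""
    p = np.clip(p, eps, 1 - eps)
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

y = np.array([1.0])    # true label
p = np.array([0.01])   # confidently wrong prediction

print(mse(y, p))                   # ~0.98: bounded penalty
print(binary_cross_entropy(y, p))  # ~4.6: much steeper penalty
```

This steeper gradient on wrong, confident predictions is one reason cross-entropy is the usual choice for classification while MSE suits regression.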
One of the most rampant problems in machine learning is overfitting.
When a network overfits, validation accuracy begins to decrease while training accuracy keeps increasing.
So once validation accuracy plateaus, stop training; this prevents overfitting.
Techniques like Dropout and Batch Normalization can also reduce overfitting.
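The stopping rule above can be sketched in a few lines: halt once validation accuracy has failed to improve for a few consecutive epochs (the accuracy sequence and patience value here are made up; most frameworks ship this as an "early stopping" callback):

```python
def early_stop_epoch(val_accuracies, patience=2):
    """Return the epoch index at which training should stop: the first
    epoch where validation accuracy has not improved for `patience`
    consecutive epochs."""
    best, waited = float("-inf"), 0
    for epoch, acc in enumerate(val_accuracies):
        if acc > best:
            best, waited = acc, 0   # new best: reset the patience counter
        else:
            waited += 1
            if waited >= patience:
                return epoch        # plateaued long enough: stop here
    return len(val_accuracies) - 1  # never plateaued: use all epochs

# validation accuracy rises, plateaus, then declines (overfitting)
history = [0.60, 0.72, 0.80, 0.79, 0.78, 0.75]
stop_at = early_stop_epoch(history)
print(stop_at)  # 4: stop two epochs after the peak at epoch 2
```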
Always analyze the training process of your model. For example, the change in the cost can indicate whether you need to increase or decrease the learning rate.
Aim for a smooth cost curve.
A design trick here is to use a variable learning rate: set a larger learning rate at the beginning, when there is much learning to be done, and lower it as you approach the end of training.
You can avoid this hassle entirely by using the Adam optimizer, as it adapts the learning rate on its own.
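If you do manage the learning rate yourself, a simple step-decay schedule is one common way to do it: start large and halve the rate at fixed intervals (the initial rate, drop factor, and interval below are arbitrary):

```python
def step_decay(epoch, initial_lr=0.1, drop=0.5, epochs_per_drop=10):
    """Multiply the learning rate by `drop` every `epochs_per_drop` epochs."""
    return initial_lr * (drop ** (epoch // epochs_per_drop))

for epoch in (0, 10, 20, 30):
    # halves every 10 epochs: 0.1 -> 0.05 -> 0.025 -> 0.0125
    print(epoch, step_decay(epoch))
```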
As seen in the previous blog, more epochs are not always better. Every model is different, and so is the amount of training it requires.
The batch size decides how many samples are processed before each weight update, and therefore how many updates happen in one pass over the entire dataset. Choose this parameter wisely, as it affects how accurate your model can become.
If the batch size is too large, the weights won't be updated often enough; if it is very small, the updates become noisy and we might miss the set of weights that generalizes best.
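The arithmetic behind this trade-off is simple: for a fixed dataset, the batch size determines how many weight updates one epoch performs (the dataset size here is arbitrary):

```python
def updates_per_epoch(n_samples, batch_size):
    """Number of weight updates in one full pass over the data,
    counting a partial final batch (ceiling division)."""
    return -(-n_samples // batch_size)

n = 50_000
for batch_size in (32, 256, 4096):
    print(batch_size, updates_per_epoch(n, batch_size))
# 32 -> 1563 updates per epoch (frequent, noisy)
# 4096 -> 13 updates per epoch (infrequent, smooth)
```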
Automate the Update
Tuning hyperparameters manually is time-consuming and laborious. Each time you change a single parameter, you have to consider its effect on the whole model, and you might need to update other parameters accordingly.
A smart way to explore how the outcome changes as you update these hyperparameters is to run the updates in a loop and record the results each time a parameter changes.
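A sketch of such a loop: iterate over a grid of hyperparameter combinations, record each result, and keep the best. The `evaluate` function below is a made-up stand-in for training and scoring your actual model:

```python
import itertools

def evaluate(learning_rate, batch_size):
    """Hypothetical stand-in: in practice, train the model with these
    hyperparameters and return its validation accuracy."""
    return 0.9 - abs(learning_rate - 0.01) * 5 - abs(batch_size - 128) / 10_000

grid = {
    "learning_rate": [0.1, 0.01, 0.001],
    "batch_size": [32, 128, 512],
}

results = []
for lr, bs in itertools.product(grid["learning_rate"], grid["batch_size"]):
    acc = evaluate(lr, bs)
    results.append({"learning_rate": lr, "batch_size": bs, "val_acc": acc})

best = max(results, key=lambda r: r["val_acc"])
print(best)  # the best-scoring combination under the toy evaluate function
```

The recorded `results` list also lets you see how sensitive the model is to each parameter, not just which combination won.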
These are a few tricks I learnt while designing and training my model; I went through a series of lectures and tutorials to summarize these tips. If you think I have missed something, or you would like to add more techniques, feel free to mention them in the responses. If you think I have misinformed you in any way, I am open to suggestions.
I will mention some resources that helped me understand deep learning better.
Good luck readers and happy learning.
Deep Learning with TensorFlow by Jon Krohn.
Deep Learning with Python by Francois Chollet.
Source: Deep Learning on Medium