Data augmentation: Not every hero wears a cape.


Data augmentation (not to be confused with feature engineering) is the process of creating more data rows out of the dataset at hand. It’s an easy way to make the training dataset up to 10–20 times bigger (depending on the problem in question).

One of the main factors in building a high-performance ML model is the amount of data it learns from: generally, the more data we have, the better the model gets. But since we’re still in the early stages of AI, data is still very scarce (and often unstructured).

Not every hero wears a cape, but like every hero, data augmentation has a flaw: in my experience, it mainly pays off for image and audio data (I haven’t come across another type of data to which it was successfully applied).

In our case, we’re dealing with image data, and what we can usually do to image data is apply rotations and translations while keeping the data logic, meaning the augmented image must still read as the same character. Since we’re dealing with characters, a rotation of more than 45 degrees would ruin that logic. We’ll settle for 4 rotations (-10°, -5°, 5° and 10°) and 4 translations of 8% (up, down, left and right).
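Here is a minimal sketch of the 4 rotations and 4 translations described above, written with NumPy only. It assumes each character is a 2-D grayscale array with a zero-valued background. The function names (`rotate`, `translate`, `augment`), the nearest-neighbour interpolation, the zero fill and the 20×20 example size are illustrative assumptions, not the code used for this article.

```python
import numpy as np

def rotate(img, degrees):
    """Rotate a 2-D image about its centre (nearest neighbour, zero fill).

    The sign convention for `degrees` is arbitrary here; since we use
    symmetric angles (-10, -5, 5, 10) it does not matter for augmentation.
    """
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(degrees)
    ys, xs = np.mgrid[0:h, 0:w]
    y0, x0 = ys - cy, xs - cx
    # Inverse mapping: for every output pixel, look up its source pixel.
    src_y = np.rint(cy + y0 * np.cos(theta) - x0 * np.sin(theta)).astype(int)
    src_x = np.rint(cx + y0 * np.sin(theta) + x0 * np.cos(theta)).astype(int)
    valid = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)
    out = np.zeros_like(img)
    out[valid] = img[src_y[valid], src_x[valid]]
    return out

def translate(img, dy, dx):
    """Shift by dy rows and dx columns; vacated pixels become zero."""
    h, w = img.shape
    out = np.zeros_like(img)
    src = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):h - max(0, -dy), max(0, dx):w - max(0, -dx)] = src
    return out

def augment(img, angles=(-10, -5, 5, 10), shift_frac=0.08):
    """Return the original image plus 4 rotated and 4 translated copies."""
    h, w = img.shape
    dy = max(1, round(h * shift_frac))  # 8% of the height, at least 1 pixel
    dx = max(1, round(w * shift_frac))  # 8% of the width, at least 1 pixel
    rotated = [rotate(img, a) for a in angles]
    shifted = [
        translate(img, -dy, 0),  # up
        translate(img, dy, 0),   # down
        translate(img, 0, -dx),  # left
        translate(img, 0, dx),   # right
    ]
    return [img] + rotated + shifted

# Hypothetical 20x20 image standing in for one character.
example = np.zeros((20, 20))
example[5:15, 8:12] = 1.0
variants = augment(example)
print(len(variants))  # 9 images: the original + 8 augmented copies
```

Every copy keeps the label of the image it came from. In practice, library routines such as `scipy.ndimage.rotate` and `scipy.ndimage.shift` would probably give smoother results, since they interpolate between pixels instead of snapping to the nearest one.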

This way, we’ll have a training set of 56,547 images, 9 times the original 6,283 (each original image plus its 8 augmented copies).
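For the record, the arithmetic behind that number can be checked in a few lines. The variable names are just for illustration.

```python
n_original = 6283      # images in the original training set
n_rotations = 4        # -10, -5, 5 and 10 degrees
n_translations = 4     # up, down, left and right

# Each image is kept and gets one copy per transformation.
copies_per_image = 1 + n_rotations + n_translations
n_augmented = n_original * copies_per_image

print(copies_per_image, n_augmented)  # 9 56547
```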

Below is the letter ‘M’ augmented as explained above.

(This article is a snippet from my upcoming work on the Julia dataset on Kaggle)

Source: Deep Learning on Medium