Original article was published by Sneha on Deep Learning on Medium
1. Is Data Augmentation overrated?
A red flag after applying an augmentation is a transformation that no longer preserves the label. This can lead the model into either a false classification or a rejection. For instance, the digit 6 may be classified as a 9, or a 3 as unknown.
Flipping, cropping, zooming, and so on: do they all sound familiar? Of course they do! Here’s why you should hold back from using them in situations where they may do damage:
i. Flipping: It is the most frequently used augmentation and has proven effective on generic object classification datasets such as CIFAR-10 and ImageNet, but not on digit and text recognition datasets such as MNIST or SVHN. By now you must have guessed why, and you are probably right! Text must not undergo arbitrary transformations, since a flip can distort a character or a text sequence and make it unidentifiable.
ii. Cropping: Images that focus on a single object, with no background to recognize, can benefit from this method because most of the prime features are preserved. However, it may not be useful when a single image contains objects of several classes, because a crop neglects spatial context and may cut out the majority of the features.
iii. Rotation: The degree of rotation must be set carefully, usually between 1° and 20° in either direction. Increasing the angle further deteriorates the image, so the label is no longer preserved after the transformation.
iv. Grayscale: It cannot be denied that grayscale conversion often speeds up computation, since it drops a few unnecessary features and results in smaller images. However, eliminating crucial color features can make the transformation non-label-preserving, which has been shown to reduce accuracy by almost 3% according to Chatfield et al. An alternative is to transform RGB to HSV (Hue, Saturation, and Value), YUV, or CMY.
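The failure modes listed above can be reproduced on toy data. The following is a minimal NumPy sketch, not a production pipeline: the 5x3 seven-segment-style glyphs and the helper names (`hflip`, `rotate_180`, `center_crop`, `to_grayscale`) are illustrative assumptions, and the grayscale weights are the common BT.601 luma weights rather than anything prescribed by the article.

```python
import numpy as np

# Hypothetical 5x3 binary glyphs for the digits 6 and 9 (seven-segment style).
SIX = np.array([
    [1, 1, 1],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
])
NINE = np.array([
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 1],
    [1, 1, 1],
])

def hflip(img):
    """Mirror an image left to right."""
    return img[:, ::-1]

def rotate_180(img):
    """Rotate an image by 180 degrees."""
    return np.rot90(img, k=2)

def center_crop(img, out_h, out_w):
    """Keep only the central out_h x out_w window of an image."""
    h, w = img.shape[:2]
    top = (h - out_h) // 2
    left = (w - out_w) // 2
    return img[top:top + out_h, left:left + out_w]

def to_grayscale(rgb):
    """Collapse the last (RGB) axis using BT.601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights

# A large rotation turns a 6 into a 9, so the label is not preserved.
print(np.array_equal(rotate_180(SIX), NINE))  # True

# A horizontal flip yields a glyph that is no longer a 6.
print(np.array_equal(hflip(SIX), SIX))  # False

# Two objects near opposite corners both vanish after a center crop.
scene = np.zeros((6, 6), dtype=int)
scene[0, 0] = 1  # object A, top-left corner
scene[5, 5] = 2  # object B, bottom-right corner
print(center_crop(scene, 4, 4).any())  # False

# Pure red and a dark gray become indistinguishable in grayscale.
red = np.array([1.0, 0.0, 0.0])
gray = np.array([0.299, 0.299, 0.299])
print(np.isclose(to_grayscale(red), to_grayscale(gray)))  # True
```

Each check is deliberately tiny, but the same pattern (apply the transform, then ask whether the label still holds) is a reasonable sanity test before adding any augmentation to a training pipeline.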
2. The angels of Data Augmentation
Not all transformations have a dark side; some, when applied in the right way, can bring exciting results.
i. Color space: Simple color augmentations, such as dropping one or two color channels, slightly decrease the brightness of an image while largely preserving its label.
ii. Translation: Shifting images, which are usually centered, keeps the spatial dimensions unchanged after augmentation, because the vacated space is filled with random or Gaussian noise or, sometimes, a fixed value.
iii. Noise injection: Adding Gaussian noise to the image matrix can improve the model, as it helps the network learn robust features.
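The three augmentations above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: images are float arrays scaled to [0, 1], color images have shape (height, width, 3), and the function names (`keep_channel`, `translate`, `add_gaussian_noise`) and default values are hypothetical choices, not taken from any library.

```python
import numpy as np

def keep_channel(rgb, channel):
    """Zero every color channel except `channel` (0=R, 1=G, 2=B)."""
    out = np.zeros_like(rgb)
    out[..., channel] = rgb[..., channel]
    return out

def translate(img, dy, dx, fill=0.0):
    """Shift an image by (dy, dx) pixels; vacated pixels receive `fill`."""
    h, w = img.shape[:2]
    out = np.full_like(img, fill)
    if abs(dy) >= h or abs(dx) >= w:
        return out  # the content has been shifted entirely out of frame
    # Matching source and destination windows that stay inside the frame.
    src_y = slice(max(0, -dy), min(h, h - dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_y = slice(max(0, dy), min(h, h + dy))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out

def add_gaussian_noise(img, sigma=0.05, rng=None):
    """Add zero-mean Gaussian noise and clip back to the [0, 1] range."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = img + rng.normal(0.0, sigma, size=img.shape)
    return np.clip(noisy, 0.0, 1.0)

# Usage on a random 8x8 RGB image.
rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
red_only = keep_channel(image, 0)
shifted = translate(image, dy=2, dx=-1, fill=0.5)
noisy = add_gaussian_noise(image, sigma=0.1, rng=rng)
print(red_only.shape, shifted.shape, noisy.shape)
```

Here `translate` fills the vacated border with a fixed value; filling with random or Gaussian noise instead would only require initializing `out` from a noise array. All three outputs keep the input shape, which is what allows them to be fed to the same network as the original image.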
3. Sisters from another mother
Data augmentation can also be regarded as an oversampling technique, as it increases the size of the data set. SMOTE, the Synthetic Minority Oversampling TEchnique, can be used as a substitute for data augmentation. It creates synthetic samples for the minority classes until they match the class with the most samples. You may notice an accuracy improvement of ~3–5% on the trained samples after implementing the SMOTE algorithm. A technique that has proven more effective than SMOTE is class weights. This method does not depend on generating samples; instead, it uses weights to shift the bias of the model towards the minority class. It is not only easier to implement but also speeds up training and, in turn, lowers the computational resources used.
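Both ideas can be sketched in NumPy. The snippet below is a simplified illustration, not the reference SMOTE implementation: `smote_sketch` and `balanced_class_weights` are hypothetical names, the neighbor search is brute force, and the weighting rule is the common inverse-frequency heuristic `n_samples / (n_classes * class_count)`, which is one of several possible choices.

```python
import numpy as np

def smote_sketch(minority, n_new, k=3, rng=None):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority neighbors."""
    rng = np.random.default_rng() if rng is None else rng
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)
    # Brute-force pairwise Euclidean distances between minority samples.
    diffs = minority[:, None, :] - minority[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]
    # Pick a random base sample and one of its neighbors for each new point.
    base = rng.integers(0, n, size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return minority[base] + gap * (minority[picked] - minority[base])

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

# Usage: 90 majority samples (class 0) against 10 minority samples (class 1).
labels = [0] * 90 + [1] * 10
print(balanced_class_weights(labels))

minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_sketch(minority_points, n_new=6, rng=np.random.default_rng(0))
print(synthetic.shape)  # (6, 2)
```

The contrast in cost is visible even in this sketch: oversampling has to compute neighbors and then train on a larger data set, whereas the class-weight approach produces one number per class that scales each sample's contribution to the loss.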
All in all, just as an excess of anything can be poisonous, so can a deep learning technique. Data augmentation has been by far the most effective and recognized tool in machine learning, but carefully considering the purpose, the model and, naturally, the data set is the only way to create a reasonably good and effective network.