Analyzing data augmentation for image classification

Original article was published on Artificial Intelligence on Medium


If we process the augmented images above and project it to the same 2D vector space as the previous cat-elephant images, we can see that the new dots are around the original image’s. This is the effect of image augmentation:

Augmentation expands the single point of an elephant image in the classification space to a whole area of elephant images.

Augmented elephant images in the cat-elephant projection

One-shot approach

When one has very few samples in a label class, the problem is called few-shot learning, and data augmentation is a crucial tool to solve this problem. The following experiment tries to prove the concept. Of course, here we have a pre-trained model on a large dataset, so it was not learned in a few-shot training. However, if we try to generate a projection using only one original elephant image, we can get something similar.

For this projection, we will use a new PCA using the original elephant image, and it’s augmented images. The augmented images in this subspace are shown in the following image.

PCA projection of the augmented images

But can this projection, using only one elephant image separate the elephants from the cats? Well, the clusters are not as clear as in the previous case (see the first scatter plot figure), but the cats and the elephants are in fact in different parts of the vector space.

Elephants and cats in the PCA projection generated from one elephant image

Summary

In this story, we illustrated the effect of the data augmentation tools used in the state of the art image classification. We visualized images of cats and elephants and the augmented images of an elephant to understand better how the model sees the augmented images.

References

[1] Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009, June). Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition (pp. 248–255). Ieee.

[2] Cubuk, E. D., Zoph, B., Shlens, J., & Le, Q. V. (2020). Randaugment: Practical automated data augmentation with a reduced search space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 702–703).

[3] Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4700–4708).

[4] Pearson, K. (1901). LIII. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 2(11), 559–572.

[5] Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of educational psychology, 24(6), 417.

[6] Ho, D., Liang, E., Chen, X., Stoica, I., & Abbeel, P. (2019, May). Population based augmentation: Efficient learning of augmentation policy schedules. In International Conference on Machine Learning (pp. 2731–2741).