Fooling Deep Neural Networks Using Unrecognizable Images


DNNs are now able to classify objects in images with near-human-level performance. But questions naturally arise as to what differences remain between computer and human vision. Recent studies reveal that changing an image in a way imperceptible to humans can cause a DNN to label the image as something else entirely.

This is an image taken from [1]. Here the researchers add a carefully crafted perturbation to the original image. The change is imperceptible to the naked human eye, yet a well-trained deep neural network picks up on it and classifies the image as something else entirely.
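The perturbation in [1] is computed with the fast gradient sign method. Below is a minimal PyTorch sketch of that idea; the model, image (assumed normalized to [0, 1]), label, and epsilon value are illustrative assumptions rather than the exact setup from the paper.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, epsilon=0.007):
    """Fast gradient sign method from [1]: add a tiny, human-imperceptible
    step in the direction that increases the model's loss."""
    # `image` is assumed to be a (1, C, H, W) tensor normalized to [0, 1],
    # and `label` a (1,) tensor with the true class index.
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Move each pixel by +/- epsilon according to the sign of the gradient.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```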

These kinds of images can fool advanced systems such as autonomous cars: an attacker could make the car's image recognition system identify a traffic light as green when it is actually red, which can lead to an accident. Some people intentionally exploit these kinds of loopholes in image recognition systems for financial gain.

Another method to fool a deep neural network is to produce images that are completely unrecognizable to humans, yet a well-trained DNN classifies them as familiar objects with 99.99% confidence.

This image is taken from [2]. Here the researchers generated images that are completely unrecognizable to the naked human eye, but well-trained neural networks still label each of them as a specific object class with very high confidence.

Such human-unrecognizable fooling images can also cause real harm in day-to-day life. For example, think about a security system that relies on face or voice recognition: swapping a white-noise-like fooling image for a face, a fingerprint, or a voice can defeat the system, and people nearby might not even realize that someone is attempting to compromise it.

These two are the main fooling techniques that have been developed against neural networks. In this series of articles we will look into how to develop these kinds of fooling images and ways to mitigate these loopholes. This article is about developing human-unrecognizable images that can fool a DNN.

There are several ways to develop human-unrecognizable images that can fool DNNs. These kinds of images are called evolved images. The focus of this article is mainly on generating evolved images using evolutionary algorithms. Evolutionary algorithms perform three basic genetic operations: selection, crossover, and mutation.

This is an image taken from [2]. This diagram shows how evolved images are created from an initial image by repeatedly applying these evolutionary operations and keeping the candidates that the DNN scores most highly.
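To make the loop concrete, here is a minimal NumPy sketch of such an evolutionary search. The classify function is a hypothetical stand-in that returns the DNN's confidence for the target class; crossover is omitted for brevity, and a simple Gaussian perturbation stands in for the mutation operator.

```python
import numpy as np

def evolve_fooling_image(classify, shape=(28, 28), pop_size=50,
                         generations=1000, mutation_rate=0.1):
    """Evolve an image that a DNN classifies with high confidence,
    even though it looks like noise to a human."""
    # Start from uniform random noise in [0, 255].
    population = [np.random.uniform(0, 255, shape) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: fitness is the DNN's confidence for the target class.
        scores = [classify(img) for img in population]
        best = population[int(np.argmax(scores))]
        # Mutation: copy the best image and randomly perturb some pixels.
        population = [best]
        for _ in range(pop_size - 1):
            child = best.copy()
            mask = np.random.rand(*shape) < mutation_rate
            child[mask] += np.random.normal(0, 25, mask.sum())
            population.append(np.clip(child, 0, 255))
    return population[0]
```

Fitness here is nothing more than the confidence the DNN assigns to the class we want it to (wrongly) predict, so the search never needs access to the network's internals.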

The two main ways to generate evolved images with evolutionary algorithms are direct encoding and indirect encoding.

Direct encoding

Each pixel value is initialized with uniform random noise in the 0 to 255 range. The pixels are then mutated independently: first, the pixels to mutate are chosen at random, and those chosen pixels are then altered via the polynomial mutation operator.
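As a rough illustration, the snippet below sketches one such mutation step with a simplified form of the polynomial mutation operator; the mutation rate and distribution index eta are illustrative defaults, not the exact values used in [2].

```python
import numpy as np

def polynomial_mutation(image, rate=0.1, eta=15.0, low=0.0, high=255.0):
    """Direct-encoding mutation: each pixel is chosen independently with
    probability `rate`, then nudged by the polynomial mutation operator."""
    mutated = image.copy()
    mask = np.random.rand(*image.shape) < rate          # pixels to mutate
    u = np.random.rand(*image.shape)
    # Polynomial mutation: small perturbations are likely, large ones rare.
    delta = np.where(u < 0.5,
                     (2.0 * u) ** (1.0 / (eta + 1.0)) - 1.0,
                     1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta + 1.0)))
    mutated[mask] += delta[mask] * (high - low)
    return np.clip(mutated, low, high)
```

A function like this could replace the simple Gaussian perturbation in the loop sketched earlier.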

This is an image taken from [2]. These images were produced by applying direct encoding against a network trained on MNIST data. As you can see, they show no recognizable structure and look like noise; this is because direct encoding mutates each pixel independently.

Indirect encoding

Indirect encoding uses compositional pattern-producing networks (CPPNs). This method tends to produce regular images as the evolved images, meaning images with visible structure such as symmetry and repeated patterns rather than pure noise.
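The sketch below is a toy, hypothetical CPPN in NumPy: every pixel's coordinates (and its distance from the centre) are passed through the same small stack of smooth functions, so the resulting image is regular by construction. Under indirect encoding, the evolutionary algorithm mutates the network's weights rather than individual pixels.

```python
import numpy as np

def cppn_image(weights1, weights2, size=64):
    """Indirect encoding: the image is produced by a small function network
    (CPPN) applied to every pixel coordinate, so mutating the weights changes
    the whole image in a structured, regular way."""
    xs, ys = np.meshgrid(np.linspace(-1, 1, size), np.linspace(-1, 1, size))
    r = np.sqrt(xs ** 2 + ys ** 2)                      # distance from centre
    inputs = np.stack([xs, ys, r], axis=-1)             # (size, size, 3)
    hidden = np.sin(inputs @ weights1)                  # periodic activation
    hidden = np.exp(-hidden ** 2)                       # Gaussian activation
    out = np.tanh(hidden @ weights2)[..., 0]            # single channel
    return ((out + 1) / 2 * 255).astype(np.uint8)       # map to [0, 255]

# Example genome: 3 inputs, 8 hidden units, 1 output, all weights random.
img = cppn_image(np.random.randn(3, 8), np.random.randn(8, 1))
```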

This is an image taken from [2]. These images were produced by applying indirect encoding against a network trained on ImageNet data. As you can see, they contain regular, recognizable patterns; this is because the CPPN encoding naturally produces structured images rather than independent pixel noise.

How to mitigate this?

In order to eliminate this behavior from DNNs, a common method is to train networks to identify fooling images. We first train a DNN (DNN1) to classify images, and then generate images that fool it using evolutionary algorithms. Next, we create a new dataset from the original dataset used to train DNN1 by adding these generated images under a new "fooling" class. Using this second dataset we train a new DNN (DNN2), which is capable of identifying the fooling images.
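A minimal PyTorch-style sketch of this retraining step is shown below. OriginalDataset, FoolingImages, and build_model are hypothetical placeholders for whatever data and architecture DNN1 was trained with; the point is simply that DNN2 gets one extra output class reserved for fooling images.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader

# Hypothetical placeholders: the dataset used for DNN1, the evolved fooling
# images (all labelled with a single new class), and the model constructor.
original = OriginalDataset()                          # classes 0 .. n-1
fooling = FoolingImages(label=original.num_classes)   # extra "fooling" class n
loader = DataLoader(ConcatDataset([original, fooling]),
                    batch_size=64, shuffle=True)

# DNN2 has one extra output unit reserved for the fooling class.
dnn2 = build_model(num_classes=original.num_classes + 1)
optimizer = torch.optim.SGD(dnn2.parameters(), lr=0.01, momentum=0.9)
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(10):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(dnn2(images), labels)
        loss.backward()
        optimizer.step()
```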

This concludes the first part of this article series. Hope to see you soon with the next part. Thank you!

References

[1] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.

[2] A. Nguyen, J. Yosinski, and J. Clune. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.