Original article was published by Pivithuru Amarasinghe on Deep Learning on Medium
Fooling Deep Neural Networks Using Unrecognizable Images
DNNs are now able to classify objects in images with near-human-level performance. But questions naturally arise as to what differences remain between computer and human vision. Recent studies reveal that changing an image in a way imperceptible to humans can cause a DNN to label the image as something else entirely.
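The imperceptible-perturbation idea can be illustrated with a toy sketch. The "model" below is just a random linear classifier in NumPy, and the step size is an arbitrary choice; the point is only that a tiny, bounded change to the input, taken against the gradient's sign (in the spirit of the fast gradient sign method of Goodfellow et al.), can swing the model's score:

```python
import numpy as np

# Toy gradient-sign perturbation. The linear "model", its weights, and
# epsilon are all made up for this sketch; a real attack would use the
# gradient of a trained DNN's loss with respect to the input image.

rng = np.random.default_rng(0)
w = rng.standard_normal(100)      # weights of a toy linear classifier
x = rng.standard_normal(100)      # a "clean image" flattened to a vector

def score(v):
    return float(w @ v)           # positive score -> class A, negative -> class B

eps = 0.2
# For a linear model the gradient of the score w.r.t. x is just w,
# so stepping against sign(w) pushes the score down while changing
# no input component by more than eps.
x_adv = x - eps * np.sign(w)

print(score(x), score(x_adv))
```

Each component of `x_adv` differs from `x` by at most 0.2, yet the score drops sharply, because the small per-pixel changes all push in the same direction.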
Such images can fool advanced systems like autonomous cars: an attacker could make the car's image-recognition system identify a red traffic light as green, which could lead to an accident. Some people intentionally exploit these kinds of loopholes in image-recognition systems to make fraudulent claims for money.
Another way to fool a deep neural network is to produce images that are completely unrecognizable to humans but that a well-trained DNN classifies as a familiar object with 99.99% confidence.
Human-unrecognizable fooling images also matter in day-to-day life. For example, consider a security system that relies on face or voice recognition being compromised: substituting crafted white noise for a face, a fingerprint, or a voice can fool the system, while people nearby would not even realize that someone is attempting to compromise it.
These are the two main techniques that have been developed to fool neural networks. In this series of articles we will look at how to create such fooling images and at ways to mitigate these loopholes. This article is about developing human-unrecognizable images that can fool a DNN.
There are several ways to develop human-unrecognizable images that can fool DNNs; such images are called evolved images. This article focuses on generating evolved images using evolutionary algorithms, which perform three basic genetic operations: selection, crossover, and mutation.
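The three genetic operations can be sketched in a minimal evolutionary loop. The fitness function below is a placeholder; in the fooling-image setting it would be the target DNN's confidence for a chosen class, and the population size, mutation rate, and genome length are illustrative choices:

```python
import numpy as np

# Minimal evolutionary algorithm showing selection, crossover, and mutation.
# fitness() is a stand-in objective; for fooling images it would query a
# trained DNN and return its confidence for the target class.

rng = np.random.default_rng(1)
GENOME = 64                                   # e.g. an 8x8 grayscale "image"

def fitness(g):
    return -np.sum((g - 0.5) ** 2)            # placeholder objective

pop = rng.random((20, GENOME))                # random initial population
for gen in range(200):
    scores = np.array([fitness(g) for g in pop])
    # selection: keep the better half as parents (elitism)
    parents = pop[np.argsort(scores)[-10:]]
    children = []
    for _ in range(10):
        a, b = parents[rng.integers(10, size=2)]
        mask = rng.random(GENOME) < 0.5       # crossover: uniform mix of parents
        child = np.where(mask, a, b)
        mut = rng.random(GENOME) < 0.05       # mutation: perturb a few genes
        child = np.clip(child + mut * rng.normal(0, 0.1, GENOME), 0, 1)
        children.append(child)
    pop = np.concatenate([parents, np.array(children)])

best = max(pop, key=fitness)
```

Because the best individuals always survive into the next generation, fitness never regresses, and repeated crossover and mutation steadily push the population toward high-scoring genomes.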
The two main ways to generate evolved images using evolutionary algorithms are direct encoding and indirect encoding.
In direct encoding, each pixel value is initialized with uniform random noise in the 0 to 255 range. The pixels are then mutated independently: first a subset of pixels is chosen for mutation, and each chosen pixel is altered via the polynomial mutation operator.
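A sketch of that direct encoding follows, using a simplified form of the polynomial mutation operator (the perturbation is scaled by the variable's range and clipped to the bounds). The image size, mutation rate, and distribution index eta are illustrative, not values from the original study:

```python
import numpy as np

# Direct encoding: the genome IS the image. Pixels start as uniform noise
# in [0, 255]; a random subset is selected for mutation, and each selected
# pixel is perturbed with a simplified polynomial mutation.

rng = np.random.default_rng(2)
H, W = 28, 28
img = rng.uniform(0, 255, size=(H, W))        # uniform random initialization

def polynomial_mutation(x, low=0.0, high=255.0, rate=0.1, eta=20.0):
    """Mutate each pixel independently with probability `rate`."""
    x = x.copy()
    mask = rng.random(x.shape) < rate          # which pixels mutate
    u = rng.random(x.shape)
    # polynomial mutation delta in [-1, 1]; larger eta -> smaller typical steps
    delta = np.where(
        u < 0.5,
        (2.0 * u) ** (1.0 / (eta + 1.0)) - 1.0,
        1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta + 1.0)),
    )
    x[mask] = np.clip(x[mask] + delta[mask] * (high - low), low, high)
    return x

mutated = polynomial_mutation(img)
```

Most mutated pixels move only a few intensity levels, which is what makes the operator useful for fine-grained search over pixel values.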
Indirect encoding uses compositional pattern-producing networks (CPPNs). This method tends to produce regular evolved images, meaning we can see in the evolved image recognizable patterns like those present in the original training images.
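The CPPN idea can be sketched as a tiny coordinate-to-intensity network: the genome is the network's weights, not the pixels, and because every pixel is produced by the same functions, the output has global, regular structure. The architecture, inputs, and activation choices below are illustrative assumptions:

```python
import numpy as np

# Indirect encoding sketch: a compositional pattern-producing network (CPPN)
# maps each pixel's coordinates to an intensity. Evolving the small weight
# matrices (the genome) changes the whole image coherently, which is why
# CPPN-evolved images show regular patterns.

rng = np.random.default_rng(3)

def cppn_image(size=64, hidden=8):
    # coordinate grid in [-1, 1], plus distance from center as an extra input
    xs, ys = np.meshgrid(np.linspace(-1, 1, size), np.linspace(-1, 1, size))
    r = np.sqrt(xs ** 2 + ys ** 2)
    inputs = np.stack([xs, ys, r], axis=-1).reshape(-1, 3)

    w1 = rng.normal(0, 1.5, (3, hidden))      # random "genome": the weights
    w2 = rng.normal(0, 1.5, (hidden, 1))
    h = np.sin(inputs @ w1)                   # periodic activation -> patterns
    out = np.tanh(h @ w2)
    # map activations to 0..255 grayscale
    return ((out + 1.0) * 127.5).reshape(size, size)

img = cppn_image()
```

In a real setup the evolutionary algorithm would mutate and recombine the weights (and, in full CPPN-NEAT, the network topology and activation functions) rather than regenerate them randomly.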
How to mitigate this?
A common method to eliminate this behavior is to train networks to identify fooling images. We first train a DNN (DNN1) to classify images, then use evolutionary algorithms to generate images that fool it. Next, we create a new dataset from the original one used to train DNN1 by adding the generated images under a new "fooling" class. Training a new DNN (DNN2) on this second dataset makes it capable of identifying these fooling images.
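The dataset-construction step of that mitigation can be sketched as follows. The arrays here are random stand-ins for the real training images and for the images evolved against DNN1; the number of classes and dataset sizes are assumptions:

```python
import numpy as np

# Mitigation sketch: extend the original training set with generated fooling
# images under a new (n+1)-th "fooling" label, then retrain on the result.
# All data below is a random stand-in for real images.

rng = np.random.default_rng(4)
NUM_CLASSES = 10

# stand-in for the original dataset used to train DNN1
X_orig = rng.random((1000, 28, 28))
y_orig = rng.integers(0, NUM_CLASSES, size=1000)

# stand-in for images evolved to fool DNN1
X_fool = rng.random((200, 28, 28))
y_fool = np.full(200, NUM_CLASSES)            # new class index: "fooling"

# second dataset: originals plus the fooling class
X2 = np.concatenate([X_orig, X_fool])
y2 = np.concatenate([y_orig, y_fool])

# DNN2 would now be trained on (X2, y2) with NUM_CLASSES + 1 output units
```

The key design point is that DNN2 needs one extra output unit: instead of being forced to assign every input to one of the original classes, it can route unrecognizable inputs to the fooling class.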
This concludes the first part of this article series. I hope to see you soon with the next part. Thank you!