Everything you need to know about Generative Adversarial Networks


By Sudip Bhandari

Adversarial training is the coolest thing since sliced bread. - Yann LeCun

Little preface

A little background first. A traditional neural network is made up of layers of interconnected computing nodes. The input signals of the training data are fed into the first layer, and the final layer has a fixed number of nodes, one per class or category, each giving the probability that the input belongs to that category. Between the first and last layers sit any number of additional layers. Signals flow from one layer to the next along weighted connections, and each node applies an activation function to the weighted sum of its inputs. During training, a cost function measures the error in the final output. This error is propagated backwards through the network, adjusting the connection weights to improve the result. Repeating this cycle of training and backpropagation eventually brings the weights close to optimal, and the result is a trained neural network model.

Adversarial networks, on the other hand, employ two neural networks competing against each other. Although the idea of adversarial networks had been around for a while, it came into the limelight with the 2014 research on generative adversarial networks by Ian Goodfellow, who later worked at OpenAI, and his colleagues.
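The training cycle of a traditional network (forward pass, error, weight update) can be sketched with a single sigmoid neuron. This is a deliberately tiny illustration rather than a full multi-layer network; the names `sigmoid`, `train_neuron`, and `OR_DATA`, the learning rate, and the choice of the logical OR function as training data are all assumptions made for the example.

```python
import math

def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(samples, epochs=2000, lr=0.5):
    """Train one sigmoid neuron with two inputs by gradient descent."""
    w = [0.0, 0.0]  # connection weights
    b = 0.0         # bias
    for _ in range(epochs):
        for (x1, x2), target in samples:
            # Forward pass: weighted sum followed by the activation function.
            prediction = sigmoid(w[0] * x1 + w[1] * x2 + b)
            # Error signal: gradient of the cross-entropy cost with respect
            # to the weighted sum.
            error = prediction - target
            # Backward pass: nudge each weight against its gradient.
            w[0] -= lr * error * x1
            w[1] -= lr * error * x2
            b -= lr * error
    return w, b

# Hypothetical training data: the logical OR of two binary inputs.
OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

w, b = train_neuron(OR_DATA)
predictions = [round(sigmoid(w[0] * x1 + w[1] * x2 + b)) for (x1, x2), _ in OR_DATA]
print(predictions)
```

A real network stacks many such nodes in layers and uses the chain rule to pass the error signal back through every layer, but the update rule per weight has the same shape.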


Two competing neural networks:

The two competing networks in this unsupervised learning approach are the generator and the discriminator. The generator’s aim is to synthesize data that resembles the real input data, starting from a random seed, and to present it to the discriminator. The discriminator then tries to flag such generated data as fake, judging it against real examples from the existing dataset. This makes it a classic example of game theory: the generator optimizes itself to fool the discriminator as often as possible, while the discriminator optimizes itself to get better at flagging generated data. Ideally the game reaches a Nash equilibrium in which the generator is so good that the discriminator can never be sure whether its inputs are real or fake.
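The game between the two networks is usually written as a minimax objective. The formula below is the standard form from the original GAN formulation; the symbols are not used elsewhere in this article and are introduced here only for illustration: D(x) is the discriminator's estimated probability that x is real, G(z) is the sample the generator produces from random noise z, p_data is the distribution of the real data, and p_z is the noise distribution.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_z(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator tries to make V large by scoring real data near 1 and generated data near 0, while the generator tries to make V small. At the equilibrium described above, the generated distribution matches the real one and the best the discriminator can do is output D(x) = 1/2 for every input.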


Difference with traditional neural networks

The major difference lies in the cost function. Traditionally, the cost function has to be carefully designed by human engineers. In an adversarial setup, each network’s cost depends on the other’s feedback, so the two networks effectively learn their cost function from each other. The discriminator tries to learn the boundary between the classes (real and generated) so that it can flag fake data. The generator tries to learn the distribution of the real data, for example the probability distribution of its signals, so that it can produce convincing fake data points.
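To make "based on each other's feedback" concrete, here is a minimal sketch of how the two costs are commonly computed from the discriminator's outputs. The function names and the use of the non-saturating generator loss are assumptions for illustration; the article does not prescribe a specific formulation. Each input is a list of probabilities in (0, 1) that the discriminator assigned to a batch of samples.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy with real samples labelled 1 and generated samples labelled 0.

    d_real: discriminator outputs for real samples.
    d_fake: discriminator outputs for generated samples.
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: low when the discriminator scores fakes as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# A confident, correct discriminator has a low loss, and the generator a high one.
print(discriminator_loss([0.9, 0.8], [0.1, 0.2]))
print(generator_loss([0.1, 0.2]))
```

Note that the generator's loss is defined entirely by what the discriminator says about its samples, which is the sense in which neither cost is fixed in advance.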

Using GAN to supplement training data for DNN

Deep neural networks notoriously demand very large datasets for training in order to work properly. Collecting such a dataset is not always easy, and this is where GANs come to the rescue. Using a GAN, we can generate additional data from the examples we already have. Although the quality of GAN output for such tasks is not yet very high, people are already using this approach in various machine learning tasks. Handwriting recognition is one example: starting from a limited set of handwriting excerpts, we can use a GAN to enlarge the dataset and train the DNN to better performance.
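A rough sketch of the augmentation step might look like the following. It assumes a generator has already been trained and can be called as a function from a noise vector to a sample; `toy_generator` is a hypothetical stand-in for that trained network, and `augment_with_generator`, `noise_dim`, and `seed` are names invented for this example.

```python
import math
import random

def augment_with_generator(real_samples, generator, n_synthetic, noise_dim=8, seed=0):
    """Return the real samples followed by n_synthetic generated ones."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        # Draw a random seed vector and let the generator turn it into a sample.
        z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
        synthetic.append(generator(z))
    return list(real_samples) + synthetic

def toy_generator(z):
    """Hypothetical stand-in for a trained generator: maps noise to 4 values in (-1, 1)."""
    return [math.tanh(v) for v in z[:4]]

real = [[0.1, 0.2, 0.3, 0.4], [0.0, -0.5, 0.5, 0.9], [0.3, 0.3, -0.1, 0.2]]
augmented = augment_with_generator(real, toy_generator, n_synthetic=5)
print(len(augmented))
```

In practice the generated samples would be checked for quality before being mixed into the training set, since poor samples can hurt the downstream model.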

How GAN philosophically relates to human learning:

There is a theory called deliberate practice about how human beings learn efficiently. The gist is that, instead of just practicing, we can become significantly better at a task if we deliberately focus on the particular aspects we are weak at. This means having a strong critic constantly reviewing our performance. In many ways this is the underlying philosophy behind GANs too, with the discriminator playing the critic. Facebook, one of the early adopters of GANs, has been using them to pursue some form of higher-order reasoning in AI, beyond standard ML tasks like image recognition.

Generator Improvement:

The performance of current state-of-the-art generator networks is still not that great. To improve it, deep convolutional GANs (DCGANs) have been used. In this approach, successive layers work on features of increasing complexity, from raw pixels and motifs to complex shapes and textures. This has brought a significant improvement in the performance of generative networks.
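One concrete piece of a deep convolutional generator is how a stack of layers grows a small feature map into a full image. The sketch below only computes the spatial sizes, using the standard output-size formula for a transposed convolution without dilation or output padding. The kernel size of 4, stride of 2, padding of 1, and 4x4 starting size are assumptions commonly seen in DCGAN-style code, not values given in this article, and the function names are made up for the example.

```python
def transposed_conv_output_size(size, kernel=4, stride=2, padding=1):
    """Spatial output size of a transposed convolution (no dilation, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel

def generator_feature_map_sizes(start=4, layers=4):
    """List the feature-map side length before and after each upsampling layer."""
    sizes = [start]
    for _ in range(layers):
        sizes.append(transposed_conv_output_size(sizes[-1]))
    return sizes

print(generator_feature_map_sizes())  # [4, 8, 16, 32, 64]
```

With these settings every layer doubles the side length, so four layers take a 4x4 feature map to a 64x64 image.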

Exciting Applications:

  • Predicting the continuation of a video by learning from its existing frames
  • Building chatbots that are more useful and more engaging (this can be thought of as Turing learning, where each party tries its hardest to outsmart the other, strengthening both as a result)
  • Single-image super-resolution, where a high-resolution image is constructed from a low-resolution one. This has outperformed earlier nearest-neighbor approaches
  • Image generation from descriptions (Research Paper Link)
  • Predicting which drug can treat a certain disease (Demo)
  • Retrieving images from a given pattern
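For context on the super-resolution item in the list above, this is roughly what the nearest-neighbor baseline does: it enlarges an image by repeating each pixel, without inventing any new detail. The function name `upscale_nearest` and the representation of an image as a list of rows are assumptions for this sketch.

```python
def upscale_nearest(image, factor):
    """Enlarge a 2D image (list of rows) by repeating each pixel factor x factor times."""
    upscaled = []
    for row in image:
        # Repeat every pixel horizontally.
        wide_row = [pixel for pixel in row for _ in range(factor)]
        # Repeat the widened row vertically.
        for _ in range(factor):
            upscaled.append(list(wide_row))
    return upscaled

low_res = [[1, 2],
           [3, 4]]
high_res = upscale_nearest(low_res, 2)
print(high_res)  # [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

The result is blocky because no new information is added. A GAN-based approach instead trains a generator to fill in plausible fine detail.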

About Author | Sudip Bhandari

I have been working as a software engineer for about two years now. I finished my undergraduate studies with a BE in Computer Science and am currently pursuing Udacity’s Machine Learning Nanodegree. Apart from programming, I am interested in music. I want to be an ML researcher and build ML solutions to various problems.


Source: Deep Learning on Medium