Source: Deep Learning on Medium
PART 1. GAN: What Is It?
GAN has a deceptively simple task: generate data from scratch, data of a quality that can fool even untrained humans.
Invented by Ian Goodfellow and colleagues in 2014, the model consists of two neural networks (a Generator and a Discriminator) competing with one another, resulting in the generation of authentic-looking content.
The purpose of the two networks may be summarised as: learn the underlying structure of the input data as thoroughly as possible, then use that knowledge to create similar content that fits all the parameters of the same category.
In the example above, the input was human faces: the model learned exactly what it is that makes a human face, well, human. Using that understanding, it generated random human faces which might otherwise have passed as real.
Let’s understand a bit more about it in detail:
This image is an oversimplified architecture of a GAN, but it captures the complete essence of the concept.
This is what happens in a single iteration of GAN:
- The Generator gets a random noise vector as input
- The Generator performs multiple transposed convolutions to upsample the noise into an image
- The Discriminator then gets a random input from either the real-world data (Real Sample) or the generated images (Fake Sample)
- As the name suggests, the Discriminator has only one job: deciding whether the input came from the “Real Sample” or the “Fake Sample”
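The upsampling step above can be sketched with a minimal DCGAN-style generator. The layer widths, the 100-dimensional noise vector, and the 64×64 RGB output are illustrative assumptions, not details from the article:

```python
import torch
import torch.nn as nn

# A minimal DCGAN-style generator: upsamples a noise vector to a 64x64
# image via stacked transposed convolutions. Channel sizes are illustrative.
class Generator(nn.Module):
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 256, 4, 1, 0),  # 1x1  -> 4x4
            nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1),    # 4x4  -> 8x8
            nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1),     # 8x8  -> 16x16
            nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, 32, 4, 2, 1),      # 16x16 -> 32x32
            nn.BatchNorm2d(32), nn.ReLU(True),
            nn.ConvTranspose2d(32, 3, 4, 2, 1),       # 32x32 -> 64x64
            nn.Tanh(),                                # pixel values in [-1, 1]
        )

    def forward(self, z):
        # z: (batch, z_dim) noise vector, reshaped to a 1x1 spatial "image"
        return self.net(z.view(z.size(0), -1, 1, 1))

z = torch.randn(8, 100)   # batch of 8 random noise vectors
fake = Generator()(z)
print(fake.shape)         # torch.Size([8, 3, 64, 64])
```

Each transposed convolution with stride 2 doubles the spatial resolution, which is how a 1×1 noise “image” grows into a full picture.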
As users, we know whether the input came from the real or the fake sample, and using this label we can backpropagate a training loss so that the discriminator gets better at its job.
But the Generator is a neural network as well, so we can backpropagate the discriminator’s signal all the way back to the random noise input and thus help it generate better images. In this way, the same loss function works for both the discriminator and the generator.
The trick lies in balancing the two networks during training. Done right, the discriminator learns to distinguish even slight abnormalities, while the generator learns to produce the most realistic outputs it can.
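A toy training iteration shows how both updates use the same binary cross-entropy loss. This is a sketch assuming PyTorch; the tiny MLPs and the shifted-Gaussian “real” data are stand-ins for illustration, not the architecture from the article:

```python
import torch
import torch.nn as nn

# Tiny stand-in networks: G maps 16-d noise to a 1-d sample,
# D maps a 1-d sample to a probability of being real.
G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

real = torch.randn(64, 1) + 3.0              # "Real Sample" batch (toy data)
z = torch.randn(64, 16)                      # random latent input
ones, zeros = torch.ones(64, 1), torch.zeros(64, 1)

# Discriminator step: push D(real) toward 1 and D(fake) toward 0.
# detach() stops this update from flowing into G's weights.
opt_d.zero_grad()
d_loss = bce(D(real), ones) + bce(D(G(z).detach()), zeros)
d_loss.backward()
opt_d.step()

# Generator step: push D(G(z)) toward 1, i.e. try to fool D.
opt_g.zero_grad()
g_loss = bce(D(G(z)), ones)
g_loss.backward()
opt_g.step()
```

Alternating these two steps, batch after batch, is what keeps the two networks in balance.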
Technical Understanding of the Working of GAN:
The Generator and the Discriminator are locked in a minimax game.
- The Generator tries to minimize the gap between real and fake images so as to fool the Discriminator.
- The Discriminator tries to maximize its understanding of real images so as to pick out the fake samples.
In the above image, D(x) is nothing but the probability that an image is a “Real Sample” image.
There is another function, G(z), which is the output of the Generator for a random latent input z. The probability that a generated image came from the “Real Sample”, as estimated by the Discriminator, is D(G(z)).
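Written out with these terms, the game the two networks play is the value function from Goodfellow et al. (2014):

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The Discriminator maximizes V by pushing D(x) toward 1 and D(G(z)) toward 0; the Generator minimizes V by pushing D(G(z)) toward 1.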
For Discriminator we want:
- Real Sample images to be rightly identified, and so, D(x) must be close to 1
- At the same time, Fake Sample images to be correctly identified as well, and so, D(G(z)) must be close to 0
For the Generator:
- It has no business with the accuracy of D(x); only D(G(z)) matters, which must be identified as a Real Sample, and so, must be as close to 1 as possible.
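Plugging hypothetical discriminator outputs into these objectives shows how each term behaves; the 0.9 / 0.1 / 0.5 probabilities below are made-up values for illustration:

```python
import math

# Per-sample losses implied by the objectives above:
# the Discriminator pays -log D(x) on real images and -log(1 - D(G(z)))
# on fakes; the Generator pays -log D(G(z)) (non-saturating form).
def d_loss(d_real, d_fake):
    return -math.log(d_real) - math.log(1 - d_fake)

def g_loss(d_fake):
    return -math.log(d_fake)

# A confident, correct Discriminator: D(x) = 0.9, D(G(z)) = 0.1
print(round(d_loss(0.9, 0.1), 3))  # 0.211 -- low loss, D is doing well
print(round(g_loss(0.1), 3))       # 2.303 -- high loss, G is being caught

# As G improves and D(G(z)) rises to 0.5, G's loss falls:
print(round(g_loss(0.5), 3))       # 0.693
```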
This loss function is the backbone of the GAN architecture; only by achieving a fine balance between the two networks do we get a high-performing Generator and Discriminator.
For those of you who are interested in learning more about GANs in detail: