Source: Deep Learning on Medium
Generative adversarial networks, or GANs, are said to be one of the biggest inventions in the field of deep learning. Recently, face-aging apps, which use a variant of GAN as the underlying algorithm, have become immensely popular. GANs have also become extremely useful for converting low-resolution images into their higher-resolution counterparts. In this article we shall see the building blocks of a basic generative adversarial network.
Architecture Of GANs:
Each GAN has two basic elements: a generator and a discriminator. These two networks can be any deep learning architecture, such as an artificial neural network, a convolutional neural network, or a long short-term memory (LSTM) network. The discriminator is expected to have a classifier at the end of the network.
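To make the two elements concrete, here is a minimal sketch in PyTorch (an assumed library choice; the layer sizes, such as a 100-dimensional noise vector and flattened 28x28 images, are illustrative, not prescribed by the article):

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a noise vector to a fake image (sizes are illustrative)."""
    def __init__(self, noise_dim=100, img_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim, 256),
            nn.ReLU(),
            nn.Linear(256, img_dim),
            nn.Tanh(),  # outputs in [-1, 1], like normalized image pixels
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """Scores an image; the sigmoid head is the classifier at the end."""
    def __init__(self, img_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid(),  # probability that the input is real
        )

    def forward(self, x):
        return self.net(x)

G, D = Generator(), Discriminator()
z = torch.randn(16, 100)          # a batch of 16 random noise vectors
fake = G(z)                       # generated images
print(fake.shape, D(fake).shape)  # torch.Size([16, 784]) torch.Size([16, 1])
```

Both networks here are plain fully connected nets; as noted above, they could just as well be convolutional or recurrent.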
The above diagram summarizes the working of a basic GAN. As we can see, the generator takes random values as input and produces an output with the expectation that it will look real to the discriminator. On the other hand, the discriminator takes input both from the real image set and from the images produced by the generator, and tries to classify them correctly, i.e. it should be able to distinguish real from fake.
Layman’s Way Of Understanding GAN:
The generator of the GAN is expected to create images that look like real images. However, the generator has no idea what the real images look like. So it takes feedback from the discriminator (which knows, or claims to know, about the real images) on how to tweak its parameters so that its output looks real. Meanwhile, the discriminator tries to identify the images produced by the generator, driving down the probability it assigns to them being real, while increasing the probability of classifying the real images correctly. This competitive learning, inspired by game theory, makes both networks stronger.
Let us consider z to be a noise vector given as input to the generator; G(z) is then the generator's output. Similarly, if x is a sample from the training set, D(x) is the probability the discriminator assigns to x being real, and D(G(z)) is the discriminator's output, a probability value, for a generated (i.e. fake) image.
Maximizing D(G(z)) is the same as minimizing 1 − D(G(z)), and likewise minimizing D(G(z)) is the same as maximizing 1 − D(G(z)). Note also that the discriminator has two losses, because it receives two kinds of input: the output of the generator and the real data samples. So the loss has to be calculated twice, and the total loss at the discriminator is the sum of the two losses on real and fake data. After calculating the losses we can do the required backpropagation and adjust the parameters.
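The two discriminator losses and their sum can be sketched as follows, assuming (as is common, though not stated in the article) that binary cross-entropy is used as the loss; the networks and the "real" batch here are placeholder stand-ins:

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()

# Toy stand-in networks, just to make the loss computation runnable.
G = nn.Sequential(nn.Linear(100, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 1), nn.Sigmoid())

real_images = torch.randn(16, 784)  # placeholder batch of "real" samples
z = torch.randn(16, 100)
fake_images = G(z)

# Loss on real samples: push D(x) toward 1 (classified as real).
loss_real = bce(D(real_images), torch.ones(16, 1))

# Loss on generated samples: push D(G(z)) toward 0 (classified as fake).
# detach() so this discriminator step doesn't update the generator.
loss_fake = bce(D(fake_images.detach()), torch.zeros(16, 1))

# Total discriminator loss is the sum of the two losses.
d_loss = loss_real + loss_fake
d_loss.backward()  # backpropagate; an optimizer would then adjust D's parameters
```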
This competition can be written as a value function V(D, G), which lets us adjust the parameters of both D and G: the discriminator tries to maximize V while the generator tries to minimize it.
min_G max_D V(D, G) = E_x~pdata [log D(x)] + E_z~pz [log(1 − D(G(z)))]
Applying this value function to both D and G gives each network its training objective.
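One alternating training step for this minimax game can be sketched as below. Note one common deviation from the raw formula: in practice the generator usually maximizes log D(G(z)) (the "non-saturating" loss) rather than directly minimizing log(1 − D(G(z))), since it gives stronger gradients early in training. The networks and hyperparameters here are illustrative placeholders:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_batch):
    n = real_batch.size(0)
    ones, zeros = torch.ones(n, 1), torch.zeros(n, 1)

    # Discriminator step: maximize log D(x) + log(1 - D(G(z))).
    z = torch.randn(n, 100)
    fake = G(z).detach()  # don't update G during the discriminator step
    d_loss = bce(D(real_batch), ones) + bce(D(fake), zeros)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: non-saturating loss, i.e. maximize log D(G(z))
    # by labeling the fakes as real.
    z = torch.randn(n, 100)
    g_loss = bce(D(G(z)), ones)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()

d_l, g_l = train_step(torch.randn(16, 784))  # placeholder "real" batch
```

In a real training run this step would loop over batches of actual images rather than random placeholder tensors.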
With the above operations, both the discriminator and the generator become stronger at their respective tasks (fooling the discriminator, for the generator, and distinguishing real from fake images, for the discriminator).
In the next article we shall see some cool applications of GANs and build a GAN from scratch.