My First Encounter with GANs

Source: Deep Learning on Medium


GANs, or Generative Adversarial Networks, are a type of neural network architecture in which neural networks learn to generate data.

Screenshot from my Notes

GANs are generative models devised by Goodfellow et al. in 2014. In a GAN setup, two differentiable functions, represented by neural networks, are locked in a game. The two players (the generator and the discriminator) have different roles in this framework.

The generator tries to produce data that looks as if it came from some target probability distribution. In the usual analogy, that would be you trying to forge tickets to a party.

The discriminator acts like a judge. It decides whether its input comes from the generator or from the true training set. That would be the party’s security comparing your fake ticket with a real ticket to find flaws in your design.
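
The game between the two players can be written down compactly. In the formulation of the original paper by Goodfellow et al., the discriminator D tries to maximize and the generator G tries to minimize the same value function. The symbols p_data (the distribution of the real data) and p_z (the noise distribution) are notation from that paper, not names used in the code of this post:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The first term rewards the discriminator for scoring real samples close to 1, and the second rewards it for scoring generated samples close to 0, which is exactly what the generator works against.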

How we create a generator conceptually

  • The generator takes a sample from the latent space and learns a mapping from the latent space to the output.
  • Create a neural network that takes noise as input and produces an image.
  • Train the generator in adversarial mode, with the generator connected to the discriminator.
  • After training, the generator can be used for inference.

How we create a Discriminator conceptually

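  • Build a neural network that classifies its input as real or fake (binary classification). This is often a Convolutional Neural Network; the example in this post uses dense layers.
  • Create a dataset that combines real images with fake images produced by the generator.
  • Train the discriminator on both the real and the fake data.
  • Learn to balance the training of the discriminator against that of the generator.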
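
To make the "real plus fake" dataset idea concrete, here is a minimal NumPy sketch. It is only an illustration under assumptions: the arrays are random stand-ins for real MNIST images and generator outputs, and the names real_images, fake_images, X and y are hypothetical, not taken from the post's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for real images and generator outputs (hypothetical data):
# 4 "real" and 4 "fake" flattened 28x28 images with values in [-1, 1].
real_images = rng.uniform(-1.0, 1.0, size=(4, 784)).astype("float32")
fake_images = rng.uniform(-1.0, 1.0, size=(4, 784)).astype("float32")

# Label real images 1 and fake images 0, as a binary classifier expects.
X = np.concatenate([real_images, fake_images], axis=0)
y = np.concatenate([np.ones((4, 1)), np.zeros((4, 1))], axis=0)

# Shuffle images and labels with the same permutation.
perm = rng.permutation(len(X))
X, y = X[perm], y[perm]

print(X.shape, y.shape)
```

The training loop later in this post does not concatenate the two sets; it feeds the real batch and the fake batch to the discriminator in two separate steps, which amounts to the same labeling scheme.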

Enough theory. Let’s build a simple generative adversarial network and learn the concepts along the way.

Import Libraries

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, LeakyReLU, BatchNormalization
from keras.optimizers import Adam
from keras import initializers
import numpy as np
# for visualizations
%matplotlib inline
import matplotlib.pyplot as plt

Loading and Preprocessing of Dataset

(X_train,_),(_,_) = mnist.load_data()
X_train = X_train.reshape(60000, 28*28)
X_train = (X_train.astype('float32') / 255 - 0.5) * 2

When loading the dataset we keep only X_train. Why? Because the GAN needs only the training images; the class labels and the test split are not used. Next, we reshape the dataset from (60000, 28, 28) to (60000, 784) and rescale the pixel values from 0–255 to the range -1 to 1, which matches the tanh output of the generator.
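
The rescaling step is easy to check on a handful of values. The sketch below applies the same formula to a small made-up array (the name pixels and its values are assumptions for illustration) and also shows one way to invert the transform, which is handy before plotting generated images.

```python
import numpy as np

# A tiny stand-in for a batch of uint8 pixels (hypothetical values).
pixels = np.array([0, 64, 128, 255], dtype=np.uint8)

# Same transform as in the post: [0, 255] -> [0, 1] -> [-0.5, 0.5] -> [-1, 1].
scaled = (pixels.astype("float32") / 255 - 0.5) * 2

# Inverse transform: [-1, 1] -> [0, 255], rounded back to integers.
restored = np.round((scaled / 2 + 0.5) * 255).astype(np.uint8)

print(scaled.min(), scaled.max())
```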

Building block — Generator

Input: Input to the generator is a series of randomly generated numbers called latent samples.

Processing: It tries to produce data that looks as if it came from some probability distribution. In a nutshell, the generator network takes random noise as input and runs it through a differentiable function that transforms and reshapes the noise into a recognizable structure.

Output: After training, the output of the generator network is a realistic-looking image. Without training, the generator produces only garbage images.
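
As a rough illustration of the input, processing and output steps, the following NumPy sketch pushes a batch of latent samples through a single random, untrained dense layer with a tanh output. It is a stand-in under assumptions, not the Keras generator built in this post: W, b and the batch size of 16 are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
latent_dim, img_dim = 100, 784

# One random (untrained) dense layer standing in for the whole generator.
W = rng.normal(0.0, 0.02, size=(latent_dim, img_dim))
b = np.zeros(img_dim)

# A batch of 16 latent samples drawn from a standard normal distribution.
z = rng.normal(0.0, 1.0, size=(16, latent_dim))

# tanh squashes the output into [-1, 1], matching the scaled MNIST pixels.
fake_images = np.tanh(z @ W + b)

print(fake_images.shape)
```

Reshaping any row of fake_images to 28x28 and plotting it would show plain noise, which is what "garbage images" means before training.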

The generator tries to fool the discriminator by generating real-looking images

latent_dim = 100
# image dimension 28x28
img_dim = 784
init = initializers.RandomNormal(stddev=0.02)

# Generator network
generator = Sequential()

# Input layer and hidden layer 1
generator.add(Dense(128, input_shape=(latent_dim,), kernel_initializer=init))
generator.add(LeakyReLU(alpha=0.2))
generator.add(BatchNormalization(momentum=0.8))

# Hidden layer 2
generator.add(Dense(256))
generator.add(LeakyReLU(alpha=0.2))
generator.add(BatchNormalization(momentum=0.8))

# Hidden layer 3
generator.add(Dense(512))
generator.add(LeakyReLU(alpha=0.2))
generator.add(BatchNormalization(momentum=0.8))

# Output layer
generator.add(Dense(img_dim, activation='tanh'))

Note: The complete code is available on my GitHub repo.

Latent dimensions are dimensions that we do not observe directly but assume to exist (hence "hidden").

We use the term in reference to the generator: it creates images from points in a latent space that we never observe directly but assume to exist.

Some Basic things

Let me explain a few more basic terms that I use in the generator model. If you already know them, skip this part.

Initializers

An initializer is the statistical distribution or function used to set the initial weights.
Why do we need one? A neural network has to start from some weights, which training then iteratively updates to better values.
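
To get a feel for what a normal initializer with a standard deviation of 0.02 produces, here is a small NumPy sketch. It mirrors RandomNormal(stddev=0.02) in spirit only; the layer shape (100 inputs, 128 units) matches the generator's first layer, while the variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Weights for a Dense(128) layer on a 100-dimensional input,
# drawn from a normal distribution with mean 0 and stddev 0.02.
weights = rng.normal(loc=0.0, scale=0.02, size=(100, 128))

print(weights.mean(), weights.std())
```

The weights start out small and centered on zero, so the first forward passes produce moderate activations rather than saturating the tanh output.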

LeakyReLU over ReLU

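LeakyReLU avoids the “dying ReLU” problem: instead of outputting zero for every negative input, it lets a small negative slope through (alpha=0.2 in the generator above), so those units still receive a gradient.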
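
The difference between the two activations is easiest to see side by side. This is a minimal NumPy sketch of the two functions, not the Keras layers themselves; relu and leaky_relu are names chosen for this illustration.

```python
import numpy as np

def relu(x):
    # Negative inputs are clamped to zero.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.2):
    # Negative inputs are scaled by alpha instead of being zeroed out.
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))
print(leaky_relu(x))
```

With ReLU, both negative inputs collapse to zero; with LeakyReLU they keep a small negative value, so the slope there is alpha rather than zero.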

Batch normalization

“Momentum” in batch norm controls how much of the running mean and variance accumulated from previous mini-batches is kept when these statistics are updated with the current mini-batch.
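
The moving-average update behind this parameter can be sketched in a few lines of plain Python. The function name update_moving_average and the batch means are assumptions for illustration; the formula is the usual exponential moving average, with momentum=0.8 as in the generator above.

```python
def update_moving_average(moving, batch_stat, momentum=0.8):
    # Keep `momentum` of the old estimate and blend in the rest
    # from the statistic of the current mini-batch.
    return momentum * moving + (1.0 - momentum) * batch_stat

moving_mean = 0.0
for batch_mean in [5.0, 5.0, 5.0]:
    moving_mean = update_moving_average(moving_mean, batch_mean)
    print(round(moving_mean, 3))
```

A lower momentum makes the running statistics follow recent batches more quickly, while a value closer to 1 makes them change more slowly.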

Building Block – Discriminator

  • The discriminator is a classifier trained using supervised learning.
  • It classifies whether an image is real (1) or is fake (0).

Basic Idea

When we feed a latent sample to the GAN, the generator internally produces a digit image, which is then passed to the discriminator for classification. If the generator does a good job, the discriminator returns a value close to 1. Initially, however, the generator produces garbage images and the loss is high, so backpropagation updates the generator’s weights to produce more realistic images as training continues.
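
Why the loss is high at first can be seen from the binary cross-entropy itself. The sketch below is a plain-Python version of that loss for a single prediction; the function and the two example scores (0.05 and 0.95) are assumptions chosen for illustration.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip the prediction so log() never receives exactly 0 or 1.
    y_pred = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(y_pred)
             + (1.0 - y_true) * math.log(1.0 - y_pred))

# The generator is trained against the label 1 ("real").
loss_when_caught = binary_crossentropy(1.0, 0.05)  # discriminator says "fake"
loss_when_fooled = binary_crossentropy(1.0, 0.95)  # discriminator says "real"

print(loss_when_caught, loss_when_fooled)
```

The loss is large when the discriminator confidently rejects the generated image and small when it is fooled, which is the signal the generator's weights are updated on.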

Discriminator tries to distinguish between real and fake images

discriminator = Sequential()

# Input layer and hidden layer 1
discriminator.add(Dense(512, input_shape=(img_dim,), kernel_initializer=init))
discriminator.add(LeakyReLU(alpha=0.2))

# Hidden layer 2
discriminator.add(Dense(256))
discriminator.add(LeakyReLU(alpha=0.2))

# Hidden layer 3
discriminator.add(Dense(128))
discriminator.add(LeakyReLU(alpha=0.2))

# Output layer
discriminator.add(Dense(1, activation='sigmoid'))

Combining Blocks

  1. Set the discriminator trainable
  2. Train the discriminator with the real MNIST digit images and the images generated by the generator to classify the real and fake images.
  3. Set the discriminator non-trainable
  4. Train the generator as part of the GAN. We feed latent samples into the GAN, let the generator produce digit images, and use the discriminator to classify them.
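
# Assumed optimizer settings (not shown in the post): learning rate 0.0002, beta_1 0.5
optimizer = Adam(0.0002, 0.5)
# Compile the discriminator on its own while it is still trainable
discriminator.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])

# Freeze the discriminator inside the combined generator + discriminator model
discriminator.trainable = False

d_g = Sequential()
d_g.add(generator)
d_g.add(discriminator)
d_g.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
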
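  • Setting discriminator.trainable = False before compiling d_g freezes the discriminator's weights inside the combined model, so they are not updated during the generator's adversarial training step.
  • The reason for trainable = False: when the combined model is trained, only the generator should improve; the discriminator is updated separately, in its own training step.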
Discriminator Architecture

Training

Training is like a two-player game: the Generator (forger) needs to learn how to create data in such a way that the Discriminator can no longer tell it is fake. The competition between the two players is what improves their knowledge, until the Generator succeeds in creating realistic data.

epochs = 100      # assumed value; the post does not state it
batch_size = 64   # batch size used in this post
# Target labels: 1 for real images, 0 for fake images
real = np.ones(shape=(batch_size, 1))
fake = np.zeros(shape=(batch_size, 1))

for e in range(epochs + 1):
    for i in range(len(X_train) // batch_size):

        # Train Discriminator weights
        discriminator.trainable = True

        # Real samples (target smoothed from 1 to 0.9)
        X_batch = X_train[i*batch_size:(i+1)*batch_size]
        d_loss_real = discriminator.train_on_batch(x=X_batch, y=real * 0.9)

        # Fake samples
        z = np.random.normal(loc=0, scale=1, size=(batch_size, latent_dim))
        X_fake = generator.predict_on_batch(z)
        d_loss_fake = discriminator.train_on_batch(x=X_fake, y=fake)

        # Discriminator loss: average of the real and fake losses
        d_loss_batch = 0.5 * (d_loss_real[0] + d_loss_fake[0])

        # Train Generator weights
        discriminator.trainable = False
        d_g_loss_batch = d_g.train_on_batch(x=z, y=real)

Well, let’s try to understand the code

  1. In the real-samples block, we take one batch of real images (the batch size is 64) and train the discriminator on it with the target "real". The target is 0.9 rather than 1 (real*(0.9)), a trick known as one-sided label smoothing that keeps the discriminator from becoming overconfident. Training the discriminator on real and fake data also lets us check that the compiled models run correctly on both the real data and the generated data.
  2. In the fake-samples block, we draw random noise as the input to the generator, generate fake MNIST images from that noise, and then train the discriminator on those generated images with the target "fake" (0).
  3. The discriminator loss for the step is the average of the two losses: half of what the discriminator sees is a batch of real images, and the other half is a batch of the same size from the generator.
  4. While training the GAN, the weights of the discriminator should stay fixed; we enforce that with the trainable flag. We then train the GAN with deliberately mislabeled generator outputs (input z = noise, target = real, i.e. 1). Why? Because we use the freshly trained discriminator to improve the generated output: the GAN loss measures how far the generator is from convincing the discriminator that its outputs are real.

Note: train_on_batch runs a single gradient update on a single batch of data.
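
It can help to work out how many such single-batch updates one epoch contains and what the targets look like. The sketch below assumes the 60000 MNIST training images and the batch size of 64 mentioned above; steps_per_epoch, leftover and smoothed_real are names introduced here for illustration.

```python
import numpy as np

n_samples, batch_size = 60000, 64

# Integer division drops the final partial batch.
steps_per_epoch = n_samples // batch_size
leftover = n_samples - steps_per_epoch * batch_size

# Targets for one batch: real labels are smoothed from 1.0 to 0.9
# only for the discriminator step; fake labels stay at 0.
real = np.ones((batch_size, 1))
fake = np.zeros((batch_size, 1))
smoothed_real = real * 0.9

print(steps_per_epoch, leftover)  # 937 32
```

So each epoch runs 937 discriminator updates on real data, 937 on fake data and 937 generator updates, and the last 32 images of the array are never used.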

Initial Result
Final Result

Great! Are these images perfect? No, but for very little effort they are not too bad (at least in my opinion!). I find it truly amazing that a neural network is able to learn how to generate images. GANs are a really exciting area of research that is starting to break the assumption that computers are not capable of being creative. Again, check out the code for this blog post on my GitHub or Kaggle kernel and be on the lookout for future posts on GANs and other machine learning topics!

To understand the concepts behind backpropagation and to dive deeper into the min-max game of GANs, read this article by Jonathan Hui.

Reference — Paper on Generative Adversarial Nets