Domain Transfer Network (DTN)

Source: Deep Learning on Medium


Transfer from face to emoji

This article illustrates the Domain Transfer Network (DTN) in the face domain: a method for transferring a set of random, unlabeled face images to a set of emoji images.

I used a source set s of one million unlabeled face images.

Architecture

The Domain Transfer Network

Block f is the feature encoder, block g is the generator, and block D is the discriminator. Together, g and D form a typical Generative Adversarial Network (GAN). Block f is pretrained and remains constant (unchanged) during training.

The loss function consists of the generator’s loss and the discriminator’s loss. The generator loss has four terms: Lgang, Lconst, Ltid, and Ltv. The equation is:

Lg = Lgang + αLconst + βLtid + γLtv

  • Lgang measures how well the generator tricks the discriminator
  • Lconst keeps f(G(s)) and f(s) close, so the encoding of a source image is preserved
  • Ltid encourages G to act as the identity mapping on samples t from the target domain
  • Ltv smooths the generated image at the pixel level (total variation)

Note: for computation, cross-entropy loss can be used for Lgang and Ld.
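Putting the four terms together, the generator loss can be sketched in PyTorch. This is a minimal sketch under assumptions: `f`, `g`, and `D` are passed in as modules, `D` is assumed to output three-way logits (with class 2 meaning “real target image”), and the default weights are illustrative, not the paper’s tuned values.

```python
import torch
import torch.nn.functional as F

def total_variation(x):
    # Anisotropic total variation: mean absolute difference between
    # neighbouring pixels, which encourages smooth outputs.
    dh = (x[:, :, 1:, :] - x[:, :, :-1, :]).abs().mean()
    dw = (x[:, :, :, 1:] - x[:, :, :, :-1]).abs().mean()
    return dh + dw

def generator_loss(f, g, D, s, t, alpha=1.0, beta=1.0, gamma=1.0):
    """Lg = Lgang + alpha*Lconst + beta*Ltid + gamma*Ltv.

    s, t: batches of source- and target-domain images; G = g(f(.)).
    alpha/beta/gamma are illustrative defaults, not the paper's values.
    """
    G_s = g(f(s))
    G_t = g(f(t))
    # Lgang: push D to label both G(s) and G(t) as class 2 ("real target")
    real_s = torch.full((G_s.size(0),), 2, dtype=torch.long)
    real_t = torch.full((G_t.size(0),), 2, dtype=torch.long)
    l_gang = F.cross_entropy(D(G_s), real_s) + F.cross_entropy(D(G_t), real_t)
    # Lconst: f should encode G(s) and s to (nearly) the same vector
    l_const = F.mse_loss(f(G_s), f(s))
    # Ltid: G should act as the identity on target-domain images
    l_tid = F.mse_loss(G_t, t)
    # Ltv: pixel-level smoothness of the generated image
    l_tv = total_variation(G_s)
    return l_gang + alpha * l_const + beta * l_tid + gamma * l_tv
```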

Feature Encoder Block

Internally, block f is a pretrained OpenFace model (a Torch implementation of face recognition). It outputs a 128-dimensional vector representation of the input image and is trained so that similar faces lie close together in the feature space.
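A minimal sketch of how block f can be held fixed during training. The stand-in network below is an assumption for illustration; in the article, f is the pretrained OpenFace model.

```python
import torch
import torch.nn as nn

def freeze(encoder: nn.Module) -> nn.Module:
    """Freeze a pretrained feature encoder so it stays constant during
    DTN training, as block f does."""
    for p in encoder.parameters():
        p.requires_grad = False
    encoder.eval()
    return encoder

# Stand-in for OpenFace: any network mapping a face image to a 128-d vector.
f = freeze(nn.Sequential(nn.Flatten(), nn.Linear(3 * 96 * 96, 128)))
embedding = f(torch.rand(1, 3, 96, 96))  # shape: (1, 128)
```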

Generator Block

  • 5 blocks, each consisting of a stride-2 transposed convolution followed by batch normalization and a ReLU
  • A 1×1 convolution is added after each block to lower Lconst
  • A final transposed convolution followed by a Tanh output layer ensures outputs lie between −1 and 1
  • The number of filters in each block can be varied for experiments
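The blocks above can be sketched as a PyTorch module. The filter counts and the 64×64 output size are illustrative assumptions; as noted, the filters can be varied.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Block g: maps a 128-d feature vector to an image in [-1, 1]."""
    def __init__(self, z_dim=128, channels=(512, 256, 128, 64, 32)):
        super().__init__()
        layers = []
        in_ch = z_dim
        for out_ch in channels:
            layers += [
                # stride-2 transposed convolution doubles the spatial size
                nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
                # extra 1x1 convolution after each block, to lower Lconst
                nn.Conv2d(out_ch, out_ch, 1),
            ]
            in_ch = out_ch
        # final transposed convolution + Tanh keeps outputs in [-1, 1]
        layers += [nn.ConvTranspose2d(in_ch, 3, 4, stride=2, padding=1),
                   nn.Tanh()]
        self.net = nn.Sequential(*layers)

    def forward(self, z):
        # treat the feature vector as a 1x1 spatial map and upsample
        return self.net(z.view(z.size(0), -1, 1, 1))
```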

Discriminator block

  • Similar to the generator block, it has 5 blocks
  • Each block contains a stride-2 convolution, batch normalization, and a LeakyReLU non-linearity with α = 0.2
  • The final output is a convolution with three filters (a three-way classification)
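A matching sketch of the discriminator. The input size of 64×64 and the filter counts are illustrative assumptions; the three output filters give one logit per class.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Block D: 5 stride-2 convolution blocks, then a final convolution
    with three filters producing three-way class logits."""
    def __init__(self, channels=(32, 64, 128, 256, 512)):
        super().__init__()
        layers = []
        in_ch = 3
        for out_ch in channels:
            layers += [
                nn.Conv2d(in_ch, out_ch, 4, stride=2, padding=1),  # halves H, W
                nn.BatchNorm2d(out_ch),
                nn.LeakyReLU(0.2, inplace=True),
            ]
            in_ch = out_ch
        # a 64x64 input is 2x2 here; this final conv yields 3 logits
        layers += [nn.Conv2d(in_ch, 3, 2)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x).flatten(1)  # (batch, 3) class logits
```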

GAN Training Strategies:

Some strategies that can be applied in addition to the architecture described above:

Balance of discriminator and generator:

Balancing is key when training a GAN; the generator is usually overpowered by the discriminator. Methods for addressing this include:

  • Train generator more than discriminator
  • Adjust hyperparameters, e.g. put a higher weight on the generator loss
  • Add a lower bound on discriminator loss
  • Change model architecture.
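The first strategy, training the generator more often than the discriminator, can be sketched as an unbalanced update schedule. The function names and the `g_steps=2` ratio are illustrative assumptions.

```python
def train_step(batch, opt_g, opt_d, g_loss_fn, d_loss_fn, g_steps=2):
    """One unbalanced training step: the discriminator is updated once,
    the generator g_steps times, to keep D from overpowering G."""
    d_loss = d_loss_fn(batch)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()
    for _ in range(g_steps):  # extra generator updates
        g_loss = g_loss_fn(batch)
        opt_g.zero_grad()
        g_loss.backward()
        opt_g.step()
```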

Normalization

Instead of normalizing input images to a standard normal distribution, normalize them to the range [−1, 1], and use Tanh as the last layer of the generator.
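For instance, a uint8 image can be mapped to [−1, 1] like this (a small sketch; the function name is an assumption):

```python
import torch

def to_gan_range(img_uint8):
    """Scale a uint8 image tensor from [0, 255] to [-1, 1], matching the
    generator's Tanh output range."""
    return img_uint8.float() / 127.5 - 1.0
```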

Avoid sparse gradients

Instead of downsampling by max-pooling, use strided convolutions. Use LeakyReLU instead of ReLU in the discriminator.
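As a sketch, the two downsampling options look like this (layer sizes are illustrative):

```python
import torch
import torch.nn as nn

# Downsampling by max-pooling: gradients are sparse, since only the
# maximum pixel in each window receives one.
pool_down = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.MaxPool2d(2))

# Preferred: a learned stride-2 convolution, which gives dense gradients.
conv_down = nn.Conv2d(3, 16, 3, stride=2, padding=1)

x = torch.rand(1, 3, 8, 8)
# Both halve the spatial resolution to (1, 16, 4, 4).
```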

Optimization parameters

Change the learning-rate schedule. Use SGD for the discriminator and Adam for the generator. Adjust the weight decay.
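A sketch of such an optimizer setup (the modules are tiny stand-ins and all hyperparameter values are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Tiny stand-ins for the generator and discriminator networks.
generator = nn.Linear(128, 64)
discriminator = nn.Linear(64, 1)

# SGD for the discriminator, Adam for the generator, with weight decay.
opt_d = torch.optim.SGD(discriminator.parameters(), lr=1e-3,
                        momentum=0.9, weight_decay=1e-5)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))

# A step-wise schedule that halves the generator's learning rate
# every 10 epochs.
sched_g = torch.optim.lr_scheduler.StepLR(opt_g, step_size=10, gamma=0.5)
```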

Conclusion

This article explained the problem of unsupervised domain transfer and showed how domain transfer can be used to perform unsupervised domain adaptation.

References

  1. Unsupervised Cross-Domain Image Generation
  2. Domain Transfer Net