Source: Deep Learning on Medium
As one of the best-known computer vision papers of 2017, the pix2pix paper (“Image-to-Image Translation with Conditional Adversarial Networks”) proposes, as its name suggests, a general solution to the problem of translating an image from one domain to another. Be it B&W to color, aerial photo to map, or sketch to picture, the paper provides a one-size-fits-all approach to rule them all.
This article serves as a memo to help me review the ideas without having to go through the paper again in the future.
Highlights of the paper:
- A cGAN is used so that the discriminator D, conditioned on the input image, provides a structured loss: it penalizes the joint configuration of the output rather than judging each pixel independently.
- A U-Net-based architecture for the generator G. In many image translation problems a great deal of low-level information is shared between the input and output, and the skip connections let that information bypass the bottleneck, which helps quite a bit.
- A convolutional PatchGAN classifier is used for the discriminator D; it penalizes structure at the scale of image patches. The N x N patch size can be much smaller than the full image and still produce high-quality results.
- Unlike more “traditional” GANs, pix2pix does not feed Gaussian noise z into the generator; the only noise is provided in the form of dropout, applied at both training and test time.
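To make the conditioning idea concrete, here is a minimal PyTorch sketch of a conditional PatchGAN discriminator. This is an illustrative simplification, not the authors' exact architecture: the layer counts and channel widths are my own choices, but it shows the two key points from the list above — D sees the input image concatenated with the (real or generated) output, and it emits a grid of per-patch real/fake logits instead of a single score.

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Sketch of a conditional PatchGAN: scores overlapping patches, not whole images."""

    def __init__(self, in_channels=6, features=64):
        super().__init__()

        def block(c_in, c_out, norm=True):
            layers = [nn.Conv2d(c_in, c_out, kernel_size=4, stride=2, padding=1)]
            if norm:
                layers.append(nn.BatchNorm2d(c_out))
            layers.append(nn.LeakyReLU(0.2))
            return layers

        self.net = nn.Sequential(
            *block(in_channels, features, norm=False),  # 256 -> 128
            *block(features, features * 2),             # 128 -> 64
            *block(features * 2, features * 4),         # 64  -> 32
            # Final stride-1 conv: one logit per receptive-field patch.
            nn.Conv2d(features * 4, 1, kernel_size=4, padding=1),  # 32 -> 31
        )

    def forward(self, input_img, output_img):
        # Conditioning: D judges (input, output) pairs, not outputs alone.
        return self.net(torch.cat([input_img, output_img], dim=1))

x = torch.randn(1, 3, 256, 256)  # conditioning image (e.g. a sketch)
y = torch.randn(1, 3, 256, 256)  # candidate translation (real or from G)
d = PatchDiscriminator()
scores = d(x, y)                 # a 31 x 31 grid of patch logits
```

One practical note on the dropout point: because pix2pix keeps dropout active at test time, in PyTorch the generator's dropout modules must stay in train() mode (or use functional dropout with training=True) rather than being switched off by a blanket eval() call.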
GAN Losses Summary: