### Inception Score (IS)

IS uses two criteria in measuring the performance of GAN:

- The quality of the generated images, and
- their diversity.

Entropy can be viewed as a measure of randomness. If the value of a random variable *x* is highly predictable, it has low entropy; if it is highly unpredictable, the entropy is high. For example, in the figure below, of the two probability distributions, *p1* has a lower entropy than *p2*.
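As a minimal sketch of this idea, the snippet below (the distributions `p1` and `p2` are illustrative stand-ins for the ones in the figure) computes Shannon entropy for a peaked and a uniform distribution:

```python
import numpy as np

def entropy(p):
    # Shannon entropy in nats; terms with p = 0 contribute nothing
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

p1 = [0.90, 0.05, 0.05]  # peaked -> predictable -> low entropy
p2 = [1/3, 1/3, 1/3]     # uniform -> unpredictable -> high entropy
print(entropy(p1) < entropy(p2))  # True
```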

In GAN, we want the conditional probability *P(y|x)* to be highly predictable (low entropy). Given an image, we should know the object type easily. So we use an Inception network to classify the generated images and predict *P(y|x)*. This measures the quality of the images.

*P(y)* is the marginal probability of the class labels, obtained by averaging *P(y|x)* over the generated images.

If the generated images are diverse, the data distribution for *y* should be uniform. The figure below visualizes this concept.
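To make this concrete, here is a small illustration with made-up Inception softmax outputs (three generated images over three classes): averaging the conditionals *P(y|x)* gives the marginal *P(y)*, which is close to uniform when the classes are covered evenly.

```python
import numpy as np

# Hypothetical softmax outputs P(y|x), one row per generated image.
p_y_given_x = np.array([
    [0.98, 0.01, 0.01],  # image 1: confidently class 0
    [0.01, 0.98, 0.01],  # image 2: confidently class 1
    [0.01, 0.01, 0.98],  # image 3: confidently class 2
])

# Marginal P(y): average the conditionals over all generated images.
p_y = p_y_given_x.mean(axis=0)
print(p_y)  # roughly uniform, since every class is represented
```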

To combine these two criteria, we compute the KL-divergence between *P(y|x)* and *P(y)*, and exponentiate its expectation over the generated images:

*IS = exp( E[ KL( P(y|x) || P(y) ) ] )*

A high-quality, diverse generator makes *P(y|x)* peaked and *P(y)* uniform, so the divergence between them, and hence IS, is large.
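A minimal NumPy sketch of this computation, assuming the Inception softmax outputs are already collected into an array (one row per generated image):

```python
import numpy as np

def inception_score(p_yx, eps=1e-12):
    """IS = exp( mean_x KL( p(y|x) || p(y) ) ) over softmax rows p_yx."""
    p_y = p_yx.mean(axis=0)  # marginal class distribution
    kl = np.sum(p_yx * (np.log(p_yx + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))

# Confident AND diverse predictions -> high IS (bounded by #classes)
sharp = np.array([[0.98, 0.01, 0.01],
                  [0.01, 0.98, 0.01],
                  [0.01, 0.01, 0.98]])
# Mode collapse: every image classified identically -> IS near 1
collapsed = np.array([[0.98, 0.01, 0.01]] * 3)
print(inception_score(sharp) > inception_score(collapsed))  # True
```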

### Fréchet Inception Distance (FID)

In FID, we use the Inception network to extract features from a specific layer. Then we model the data distribution for these features as a multivariate Gaussian distribution with mean *µ* and covariance *Σ*. The FID between the real images *x* and generated images *g* is:

*FID(x, g) = ||µx − µg||² + Tr( Σx + Σg − 2(ΣxΣg)^(1/2) )*

where *Tr* sums up all the diagonal elements.
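The formula above can be sketched directly in NumPy. The feature arrays here are random stand-ins for real Inception features; the trace of the matrix square root is computed through the equivalent symmetric form *Tr((Σg^(1/2) Σx Σg^(1/2))^(1/2))*, which keeps the result real without needing a general matrix square root:

```python
import numpy as np

def _psd_sqrt(m):
    # Square root of a symmetric positive semi-definite matrix.
    vals, vecs = np.linalg.eigh(m)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def fid(feat_x, feat_g):
    """FID between two feature sets of shape (n_samples, dim)."""
    mu_x, mu_g = feat_x.mean(axis=0), feat_g.mean(axis=0)
    cov_x = np.cov(feat_x, rowvar=False)
    cov_g = np.cov(feat_g, rowvar=False)
    sg = _psd_sqrt(cov_g)
    tr_covmean = np.trace(_psd_sqrt(sg @ cov_x @ sg))
    return float(np.sum((mu_x - mu_g) ** 2)
                 + np.trace(cov_x) + np.trace(cov_g) - 2.0 * tr_covmean)

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 8))  # stand-in "real" features
fake = rng.normal(0.5, 1.0, size=(500, 8))  # mean-shifted "generated" features
print(fid(real, real) < fid(real, fake))  # identical sets give FID ~ 0
```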

FID is more robust to noise than IS. In addition, a model that generates only one image per class can still achieve a high IS, but it will score poorly on FID because its feature distribution does not match that of the real data.

### Reference

Source: Deep Learning on Medium