Downsampling and Upsampling of Images — Demystifying the Theory

Source: Deep Learning on Medium

While this article is aimed mainly at a technical audience (Machine Learning and Deep Learning enthusiasts and practitioners), it is not limited to them. The concepts discussed here are also useful in any image processing domain, including photo editing.

Background

A computer works best with numbers. Whether it has to do a mathematical calculation or work with multimedia, text, or signals, everything is represented in the computer as numbers. The topic here is the resizing of images.

Think of how images are stored on a computer. The image is broken into tiny elements called pixels, and each pixel represents one color. An image with a resolution of 1024 by 798 pixels therefore has 1024 x 798 = 817,152 pixels. In other words, the image is that many color points arranged in a matrix.
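To make the matrix picture concrete, here is a minimal sketch using NumPy. The blank image, the choice of three RGB channels per pixel, and the variable names are assumptions for illustration only; the article itself does not prescribe a storage layout.

```python
import numpy as np

# Hypothetical blank image: 798 rows (height) by 1024 columns (width).
# Assumption: each pixel's color is stored as three 8-bit channels (R, G, B).
height, width = 798, 1024
image = np.zeros((height, width, 3), dtype=np.uint8)

# Set the pixel at row 0, column 0 to pure red.
image[0, 0] = [255, 0, 0]

# Total number of pixels is rows times columns.
num_pixels = image.shape[0] * image.shape[1]
print(image.shape)  # (798, 1024, 3)
print(num_pixels)   # 817152
```

Note that array libraries usually index by row first, so a "1024 by 798" image (width by height) appears as a 798 x 1024 matrix.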

So let’s start with the relatively easier of the two:

Downsampling

As the name suggests, this technique has to do with downscaling the image. The intuition is right: we have to somehow downscale the image, for reasons such as:

  • Making the data a more manageable size
  • Reducing the dimensionality of the data, which enables faster processing of the data (image)
  • Reducing the storage size of the data

The technique has other uses as well, depending on the application.

It is sometimes confused with image compression, which is a different thing and serves a different purpose altogether. Here we are concerned only with shrinking the image, which essentially means throwing away some of the (non-essential) information.

From this, we can infer that we need to discard some of the rows and/or columns of the image, giving up some of the information they carry.
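The idea of discarding rows and columns can be sketched in a few lines of NumPy. This is only the most naive form of downsampling, and the function name `downsample_by_dropping` is a hypothetical helper, not something from the article or from a library.

```python
import numpy as np

def downsample_by_dropping(image, factor):
    """Keep every `factor`-th row and column and discard the rest."""
    return image[::factor, ::factor]

# A small 6 x 6 "image" whose pixel values are just 0..35.
image = np.arange(36).reshape(6, 6)

small = downsample_by_dropping(image, 2)
print(small.shape)  # (3, 3)
print(small)
# [[ 0  2  4]
#  [12 14 16]
#  [24 26 28]]
```

Every kept value comes straight from the original; the values in the dropped rows and columns are simply lost. Practical resizers typically smooth or average neighboring pixels before discarding, which this sketch does not attempt.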

There are many algorithms used in various techniques for downsampling, namely:

Upsampling

Upsampling, on the other hand, has the opposite objective: to increase the number of rows and/or columns (the dimensions) of the image. It is used in several settings, such as GANs (Generative Adversarial Networks), where the intention is to construct an image out of a random vector sample, mimicking an image from the ground-truth or real distribution. There are other uses as well, such as improving the quality of an image. Let’s discuss this in more detail.

When downsampling, our intention was fairly simple and clear, but upsampling is not that simple. We need to somehow increase the dimensions of the image and fill in the gaps (columns/rows). Suppose you want to upsample the original image by a factor of 3. This means you need to add 2 more rows/columns for each row/column in the image using some logic. One way could be to just repeat each column/row in the original image.

Image source: giassa.net

If you were to do it this way, you would observe that the original image and the resulting image look quite similar, if not identical. To drive the point home: you have not created any “new” data in the resulting image. Since the duplicated rows and columns are completely redundant, this method adds no new information.
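The repetition approach described above can be sketched with NumPy's `np.repeat`. The helper name `upsample_by_repeating` and the tiny 2 x 2 input are assumptions made purely for illustration.

```python
import numpy as np

def upsample_by_repeating(image, factor):
    """Repeat every row and every column `factor` times."""
    rows_repeated = np.repeat(image, factor, axis=0)
    return np.repeat(rows_repeated, factor, axis=1)

image = np.array([[10, 20],
                  [30, 40]])

big = upsample_by_repeating(image, 3)
print(big.shape)       # (6, 6)
print(big[0])          # [10 10 10 20 20 20]
print(np.unique(big))  # [10 20 30 40]
```

The output is three times larger in each dimension, yet it contains exactly the same four values as the input, which is the redundancy the paragraph above points out.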

A more sensible approach is to interpolate the new rows/columns from the existing ones, computing reasonably accurate intermediate values using some more advanced mathematical procedures.
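As a rough illustration of interpolation, the sketch below upsamples a single row of pixel values using linear interpolation via `np.interp`. Linear interpolation is chosen here only as the simplest example; the helper name `upsample_row_linear` and the convention of inserting `factor - 1` new values between each neighboring pair are assumptions, not the article's prescribed method.

```python
import numpy as np

def upsample_row_linear(row, factor):
    """Insert `factor - 1` linearly interpolated values between each
    pair of neighboring pixels in a 1-D row."""
    row = np.asarray(row, dtype=float)
    old_positions = np.arange(len(row))
    # New sample positions, evenly spaced across the original range.
    new_count = (len(row) - 1) * factor + 1
    new_positions = np.linspace(0, len(row) - 1, new_count)
    return np.interp(new_positions, old_positions, row)

row = [0, 30, 60]
upsampled = upsample_row_linear(row, 3)
print(upsampled)
```

For this input the result has 7 values running from 0 to 60 in steps of roughly 10. Unlike plain repetition, the inserted values are intermediate estimates rather than copies. A full image would apply the same idea along both rows and columns.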

Examples of some of these algorithms are: