Introduction to Super-Resolution Machine Learning Models

Source: Deep Learning on Medium

How does it work?

In technical terms, super-resolution is an ill-posed problem: for a single degraded low-resolution (LR) image, there are multiple possible upscaled high-resolution (HR) images. Simple algorithm-based approaches, such as interpolation, use the local information in an LR image to compute the corresponding HR image. These have been used for a long time, but the results they produce are often of poor quality, typically blurry and lacking fine detail.
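
To make the ill-posedness concrete, here is a minimal NumPy sketch. It assumes a simple box-filter (block averaging) degradation, which the article does not specify, and the helper names `box_downsample` and `nearest_upsample` are illustrative. Two different HR images collapse to the same LR image, so no upscaler can tell from the LR image alone which one it came from.

```python
import numpy as np

def box_downsample(hr, scale):
    """Average each scale x scale block (an assumed, simple degradation)."""
    h, w = hr.shape
    return hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def nearest_upsample(lr, scale):
    """Classic non-learned upscaling: repeat every pixel along both axes."""
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)

# Two different 4x4 HR images whose 2x2 block averages are identical.
hr_a = np.array([[0, 4, 1, 3],
                 [4, 0, 3, 1],
                 [2, 2, 5, 5],
                 [2, 2, 5, 5]], dtype=float)
hr_b = np.array([[2, 2, 2, 2],
                 [2, 2, 2, 2],
                 [0, 4, 5, 5],
                 [4, 0, 5, 5]], dtype=float)

lr_a = box_downsample(hr_a, 2)
lr_b = box_downsample(hr_b, 2)

print(np.array_equal(lr_a, lr_b))   # True: same LR image
print(np.array_equal(hr_a, hr_b))   # False: different HR images

# The simple upscaler returns one fixed guess, which here matches neither original.
guess = nearest_upsample(lr_a, 2)
```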

In supervised machine learning approaches, a model is trained on a large dataset to learn the mapping from LR to HR images. If the degradation function is known, a large training set of LR-HR pairs can be created, since the LR images can be generated directly from HR images that are already available.
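
As a rough sketch of how such a training set could be generated, the snippet below crops random HR patches and degrades each one to get its LR partner. The block-averaging `degrade` function, the `make_pairs` helper, and the patch sizes are all assumptions made for illustration. Real pipelines often use bicubic downsampling, blur, or noise instead.

```python
import numpy as np

def degrade(hr, scale):
    """Assumed known degradation: block averaging by an integer factor."""
    h, w = hr.shape
    return hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def make_pairs(hr_images, scale, patch=32, per_image=4, seed=0):
    """Crop random HR patches and degrade each one to build (LR, HR) pairs."""
    rng = np.random.default_rng(seed)
    lr_patches, hr_patches = [], []
    for img in hr_images:
        h, w = img.shape
        for _ in range(per_image):
            top = rng.integers(0, h - patch + 1)
            left = rng.integers(0, w - patch + 1)
            hr_patch = img[top:top + patch, left:left + patch]
            lr_patches.append(degrade(hr_patch, scale))
            hr_patches.append(hr_patch)
    return np.stack(lr_patches), np.stack(hr_patches)

# Stand-in "HR images": random arrays used only to show the shapes involved.
hr_images = [np.random.default_rng(i).random((64, 96)) for i in range(3)]
lr_set, hr_set = make_pairs(hr_images, scale=4)
print(lr_set.shape, hr_set.shape)  # (12, 8, 8) (12, 32, 32)
```

The patch size must be divisible by the scale factor for this particular degradation to work.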

If the degradation function is unknown, collecting a training set is difficult because existing pairs of HR and LR images are needed. In this case, unsupervised learning may be used to approximate the degradation function.

The mapping function learned by the model approximates the inverse of the degradation function that was applied to the HR image.
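
One common way to write this down is shown below. The notation is an assumption rather than something taken from the article: $\mathcal{D}$ is the degradation function with parameters $\delta$, $\mathcal{F}$ is the model with parameters $\theta$, and $\mathcal{L}$ is some reconstruction loss such as a pixel-wise L1 or L2 distance.

```latex
\begin{aligned}
I_{LR} &= \mathcal{D}(I_{HR};\, \delta) \\
\hat{I}_{HR} &= \mathcal{F}(I_{LR};\, \theta) \approx \mathcal{D}^{-1}(I_{LR}) \\
\hat{\theta} &= \arg\min_{\theta} \; \frac{1}{N} \sum_{i=1}^{N}
    \mathcal{L}\!\left(\mathcal{F}\big(I_{LR}^{(i)};\, \theta\big),\; I_{HR}^{(i)}\right)
\end{aligned}
```

Because $\mathcal{D}$ maps many HR images to the same LR image, $\mathcal{D}^{-1}$ is not a true inverse. The model can only learn a plausible choice among the candidates, which is the ill-posedness described earlier.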

Model Training and Design

Many different approaches have been applied to train super-resolution models using various model architectures. These approaches differ in the stage of the network at which upsampling is performed.

Earlier super-resolution models used a pre-upsampling approach, in which the LR images are first upscaled to coarse HR images using traditional algorithms, and convolutional neural networks (CNNs) then learn the mapping from these coarse HR images to the desired HR images.

Figure: pre-upsampling approach
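
The data flow of a pre-upsampling model can be sketched in plain NumPy as follows. This is only an illustration of the ordering of the steps: nearest-neighbor repetition stands in for the "traditional algorithm" (bicubic interpolation is a more typical choice), the kernels are random rather than trained, and `conv2d_same` and `pre_upsampling_sr` are hypothetical helper names.

```python
import numpy as np

def nearest_upsample(lr, scale):
    """Fixed, non-learned upscaling (stand-in for e.g. bicubic interpolation)."""
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)

def conv2d_same(x, kernel):
    """Single-channel 2D cross-correlation with zero padding; odd kernel sizes only."""
    kh, kw = kernel.shape
    padded = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(x.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def pre_upsampling_sr(lr, scale, kernels):
    """Upscale first, then refine the coarse HR image with convolution layers."""
    x = nearest_upsample(lr, scale)                 # step 1: coarse HR image
    for k in kernels[:-1]:
        x = np.maximum(conv2d_same(x, k), 0.0)      # step 2: conv + ReLU at HR size
    return conv2d_same(x, kernels[-1])              # final conv, no activation

rng = np.random.default_rng(0)
lr = rng.random((8, 8))
kernels = [rng.normal(scale=0.1, size=(3, 3)) for _ in range(3)]  # untrained weights
sr = pre_upsampling_sr(lr, scale=4, kernels=kernels)
print(sr.shape)  # (32, 32)
```

Note that every convolution here runs on the full HR-sized grid, which is scale² times more pixels than the LR input.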

In the post-upsampling approach, the LR images are passed directly to the CNN, and the upsampling layer sits at the end of the network. The upsampling layer is learnable and is trained together with the preceding convolution layers in an end-to-end manner.

Figure: post-upsampling approach
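
The article does not say which learnable upsampling layer is used. One widely used option is sub-pixel convolution: a convolution produces scale² feature channels at LR resolution, and a "pixel shuffle" rearranges them into a larger image. The sketch below assumes that choice. A hypothetical per-channel weight vector stands in for the learned convolution, and `post_upsampling_sr` is an illustrative name.

```python
import numpy as np

def pixel_shuffle(features, scale):
    """Rearrange (C * scale^2, H, W) feature maps into (C, H * scale, W * scale)."""
    c_r2, h, w = features.shape
    c = c_r2 // (scale * scale)
    x = features.reshape(c, scale, scale, h, w)
    x = x.transpose(0, 3, 1, 4, 2)                  # -> (c, h, scale, w, scale)
    return x.reshape(c, h * scale, w * scale)

def post_upsampling_sr(lr, scale, weights):
    """All feature computation happens at LR size; upsampling is the last step.

    weights: shape (scale^2,), a stand-in for a learned 1x1 convolution
    that maps the single input channel to scale^2 output channels.
    """
    features = weights[:, None, None] * lr[None, :, :]   # (scale^2, H, W)
    return pixel_shuffle(features, scale)[0]

lr = np.arange(6, dtype=float).reshape(2, 3)
sr = post_upsampling_sr(lr, scale=2, weights=np.ones(4))
print(sr.shape)  # (4, 6)
```

With all weights equal to one, this layer reproduces nearest-neighbor upsampling. Training would adjust the weights so that each of the scale² sub-pixel positions gets its own learned value.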

The pre-upsampling and post-upsampling approaches are the most widely used. Others have also been explored, such as progressive upsampling and iterative upsampling, both of which can yield better results but are more complex to design and train.
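
As a rough sketch of the progressive idea, the image can be enlarged in several ×2 stages, with each stage followed by its own refinement step. In the code below, `refine` is a hypothetical placeholder for the learned layers of each stage, and nearest-neighbor repetition stands in for a learned ×2 upsampler.

```python
import numpy as np

def upsample_x2(x):
    """Stand-in for a learned x2 upsampling layer."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def progressive_sr(lr, num_stages, refine):
    """Upscale by 2 per stage, keeping every intermediate result."""
    outputs = []
    x = lr
    for stage in range(num_stages):
        x = refine(upsample_x2(x), stage)   # refine at the new resolution
        outputs.append(x)
    return outputs

lr = np.random.default_rng(0).random((8, 8))
# Identity refinement, used only to show the shapes at each stage.
stages = progressive_sr(lr, num_stages=3, refine=lambda x, stage: x)
print([s.shape for s in stages])  # [(16, 16), (32, 32), (64, 64)]
```

Keeping the intermediate outputs makes it possible to compare each one against an HR target at the matching scale during training, which is part of the added complexity.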

Along with these upsampling approaches, the network design and the types of convolutions used are also very important. Super-resolution models require that the information in the LR image be preserved in the HR image. They therefore focus mainly on learning the residual between the LR and HR images, which is why residual network designs with skip connections are used in most networks.
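
The residual idea can be sketched with a single global skip connection: the upscaled LR image is carried straight to the output, and the network only has to predict what is missing. The names `residual_sr` and `predict_residual` are hypothetical, and nearest-neighbor upscaling is an assumed stand-in for whatever upsampling the network uses.

```python
import numpy as np

def nearest_upsample(lr, scale):
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)

def residual_sr(lr, scale, predict_residual):
    """Output = upscaled LR (skip path) + predicted residual (learned path)."""
    base = nearest_upsample(lr, scale)        # preserves the LR content directly
    return base + predict_residual(base)      # skip connection adds the correction

rng = np.random.default_rng(0)
hr = rng.random((8, 8))
lr = hr.reshape(4, 2, 4, 2).mean(axis=(1, 3))     # assumed block-average degradation

# The training target for the learned path is the residual, not the full image.
target_residual = hr - nearest_upsample(lr, 2)

# A predictor that outputs nothing still returns a valid (coarse) image...
coarse = residual_sr(lr, 2, lambda base: np.zeros_like(base))
# ...and a perfect residual predictor recovers the HR image exactly.
perfect = residual_sr(lr, 2, lambda base: target_residual)
print(np.allclose(perfect, hr))  # True
```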