Source: Deep Learning on Medium
At Earthcube, we chose to build our super-resolution algorithm on ResNets. Because these networks use residual information (the difference between the output and the input of a layer) to learn how to add details to the original image, we find that the risk of the network “inventing” information not present in the original image is much lower than with GANs, which reconstruct the whole image.
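To make the residual idea concrete, here is a minimal sketch in plain Python. It is not our actual network: the names `residual_block`, `no_detail` and `edge_boost` are hypothetical, and a short list of floats stands in for an image. The point is only that the layer learns a correction which is added to its input, so a correction of zero hands the input back untouched.

```python
def residual_block(x, correction):
    """Return x plus a correction computed from x, element by element."""
    delta = correction(x)
    return [xi + di for xi, di in zip(x, delta)]


def no_detail(x):
    # Hypothetical stand-in for a layer that has learned nothing yet.
    return [0.0 for _ in x]


def edge_boost(x):
    # Hypothetical stand-in for a learned correction: a small, fixed nudge.
    return [0.25, -0.25, 0.0]


pixels = [0.5, 0.5, 1.0]
unchanged = residual_block(pixels, no_detail)   # input passes straight through
detailed = residual_block(pixels, edge_boost)   # input plus added detail
```

With a zero correction the block is the identity, which is why a residual network starts from the original image and only has to learn what to add to it.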
As a first step, we selected a training dataset composed of photos extracted from DIV2K, the traditional super-resolution challenge dataset. Interestingly, this training dataset did not contain any remote sensing images, which made it a good test of whether our network would generalise well when applied to our satellite images. We paid close attention to the way we generated the low-resolution images from the high-resolution ones, as this is known to have a strong impact on the network’s performance.
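As an illustration of what generating a low-resolution image involves, the sketch below shrinks an image by averaging each block of pixels. This box filter is an assumption chosen for simplicity, not necessarily the degradation we used; the function name `downsample_box` and the toy 4x4 image are hypothetical.

```python
def downsample_box(image, factor):
    """Shrink an image (list of rows) by averaging factor x factor blocks."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of the factor")
    low_res = []
    for i in range(0, height, factor):
        row = []
        for j in range(0, width, factor):
            # Collect the pixels of one block and replace them by their mean.
            block = [image[i + di][j + dj]
                     for di in range(factor)
                     for dj in range(factor)]
            row.append(sum(block) / len(block))
        low_res.append(row)
    return low_res


high_res = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [8, 8, 0, 0],
    [8, 8, 0, 0],
]
low_res = downsample_box(high_res, 2)
```

Swapping the averaging step for another filter changes the low-resolution images the network sees during training, which is why this choice matters.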
We trained various networks, mainly based on the VDSR architecture, for different upscaling factors: 2, 4, 6 and 8. In terms of the quality of the image produced, we found that it was generally better to use the highest possible ratio. However, the higher the upscaling ratio, the bigger the super-resolved image gets. With that in mind, we found that a ratio of 4 was a good compromise: good quality with a reasonably sized output image.
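The size trade-off is simple arithmetic: each side of the image is multiplied by the upscaling factor, so the pixel count grows with its square. The short sketch below prints this for the four factors we tried; the 256x256 tile size and the helper names are hypothetical, picked only for illustration.

```python
def upscaled_shape(height, width, factor):
    """Height and width of the super-resolved image."""
    return height * factor, width * factor


def pixel_growth(factor):
    """How many times more pixels the output has than the input."""
    return factor * factor


tile = (256, 256)  # hypothetical input tile size
for factor in (2, 4, 6, 8):
    h, w = upscaled_shape(*tile, factor)
    print(f"x{factor}: {h}x{w} pixels, "
          f"{pixel_growth(factor)} times the input pixel count")
```

Going from a factor of 4 to a factor of 8 multiplies the output pixel count by another 4, which is the cost to weigh against the gain in quality.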
We chose to train the network using the Adam optimizer, which works well for most of our use cases. As for the loss, we used a simple L1 loss calculated between the upscaled image and the real high-resolution image. Obviously, there is room for improvement here, and we are actively working on more sophisticated losses to guide the network to a better operating point. However, the L1 loss was surprisingly effective at producing very good results.
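For readers unfamiliar with it, the L1 loss is just the mean absolute difference between the two images, pixel by pixel. The sketch below is a plain-Python illustration rather than our training code; the function name `l1_loss` and the toy 2x2 images are hypothetical.

```python
def l1_loss(predicted, target):
    """Mean absolute difference between two images given as lists of rows."""
    same_rows = len(predicted) == len(target)
    if not same_rows or any(len(p) != len(t) for p, t in zip(predicted, target)):
        raise ValueError("images must have the same shape")
    total = 0.0
    count = 0
    for p_row, t_row in zip(predicted, target):
        for p, t in zip(p_row, t_row):
            total += abs(p - t)
            count += 1
    return total / count


upscaled = [[0.0, 0.5], [1.0, 0.25]]    # toy network output
reference = [[0.0, 1.0], [0.5, 0.25]]   # toy high-resolution ground truth
loss = l1_loss(upscaled, reference)
```

Every pixel error contributes in proportion to its size, so a perfect reconstruction gives a loss of exactly zero.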