Zero Shot Super Resolution Part 3: Training the Model

Source: Deep Learning on Medium

By Shahar Guigui

In part 1, we reviewed Single Image Super Resolution (SISR) methods and Zero-Shot Super Resolution in particular. In part 2, we discussed how to run the code example with Keras, a TensorFlow backend, OpenCV and a free MissingLink account. The applications for this technique range from medical imaging and working with compressed images to agriculture analysis, autonomous driving, satellite imagery, reconnaissance and more. Now it’s time to go through the code, discuss how it works and highlight ways to modify it for future experiments.

By this point, you should already have been able to run the project. If you have not set it up, please review part 2 or the README file. We’ll be highlighting specific blocks of code from the file to better explain what is going on under the hood. As a quick recap, you’ll need to clone the project, install the dependencies, connect the account to the project and preview the experiment in real time on the dashboard.

When you open the file, you’ll see the Python libraries we’ll be using, including NumPy, cv2 (OpenCV), Keras, and MissingLink.

From parts 1 and 2, you should be familiar with most of these. If this is your first time working with the MissingLink SDK, it is a platform for deep learning workflow automation. The SDK enables fast and efficient development of complex deep learning models by taking away the pain of managing numerous experiments, large datasets, local and off-premise machines, and code versioning. We’ll be leveraging it to track real-time values of our experiment and later on, for more advanced users, we’ll be digging into MissingLink’s data management features.

Inside the main program, we set up an argument parser for configuring parameters from the CLI, get the paths for the output directory and the target image, and then load the image. We also define TensorFlow as our backend and the image dimension ordering as channels last. Different Keras backend libraries (TensorFlow, Theano) default to opposite orderings, i.e., channels last vs. channels first, respectively. Setting the ordering explicitly helps us avoid bugs caused by feeding our neural network data in the wrong shape.
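As a rough illustration of that setup, here is a minimal sketch of such a parser. Every flag name and default value below is hypothetical and chosen only for the example; the project’s own parser may differ.

```python
import argparse


def build_parser():
    """Build a CLI parser; every flag name and default below is hypothetical."""
    parser = argparse.ArgumentParser(description="ZSSR training sketch")
    parser.add_argument("--epochs", type=int, default=1000,
                        help="number of training epochs")
    parser.add_argument("--filters", type=int, default=64,
                        help="filters per convolution layer")
    parser.add_argument("--noise-std", type=float, default=30.0,
                        help="standard deviation of noise added to lr-sons")
    parser.add_argument("--sr-factor", type=int, default=2,
                        help="upscaling factor")
    parser.add_argument("--image", default="input.png",
                        help="path of the target image")
    parser.add_argument("--output-dir", default="output",
                        help="directory for results")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["--epochs", "500", "--noise-std", "0"])
print(args.epochs, args.noise_std, args.sr_factor)

# With Keras, the channels-last ordering can be set with
# keras.backend.set_image_data_format("channels_last"); images then have
# the shape (height, width, channels), e.g. (256, 256, 3) for RGB.
```

Flags that are not passed keep their defaults, so a run only needs to spell out the values being changed.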

We first define the MissingLink callback. Then we use it to set properties of the experiment, such as its display_name and description.

These will be displayed on the dashboard and help us to tell experiments apart as well as keep track of useful data.

We can also change the description from the UI after training. This lets us add important data and keep an “experiment notebook” style record of our research.

We call the build_model() function and assign the returned model to the zssr variable.

lrate is an instance of the Keras LearningRateScheduler callback, which takes the step_decay function and uses it to update the learning rate at the start of each epoch.
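A step decay schedule typically drops the learning rate by a constant factor every fixed number of epochs. The sketch below shows one common form; the constants INITIAL_LRATE, DROP and EPOCHS_DROP are assumptions for illustration, not the project’s actual values.

```python
import math

# Hypothetical schedule constants; the project may use different values.
INITIAL_LRATE = 0.001
DROP = 0.5          # multiply the learning rate by this factor...
EPOCHS_DROP = 100   # ...every EPOCHS_DROP epochs


def step_decay(epoch):
    """Return the learning rate for a given (0-based) epoch index."""
    return INITIAL_LRATE * math.pow(DROP, math.floor(epoch / EPOCHS_DROP))


# In Keras the function would be wrapped as
# lrate = keras.callbacks.LearningRateScheduler(step_decay)
# and passed to training through the callbacks list.
for epoch in (0, 99, 100, 250):
    print(epoch, step_decay(epoch))
```

With these values the rate stays constant for the first 100 epochs and is halved at each following multiple of 100.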

We start training by calling zssr.fit_generator(), which is fed with data from the custom image_generator. We save the model to the Data Management artifact utility. We then call the predict_func() and accumulated_result() functions to get our super-resolution outputs.

When a ground-truth high-resolution image is available, we check our results against it using the metrics we defined previously: PSNR and SSIM.
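For reference, PSNR is defined as 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value and MSE is the mean squared error between the two images. The following is a minimal NumPy sketch of that definition, not the project’s own metric function; SSIM is more involved and is left out here.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    # Convert to float first so uint8 subtraction cannot wrap around.
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


ground_truth = np.full((8, 8, 3), 100, dtype=np.uint8)
output = ground_truth.copy()
output[0, 0, 0] = 110  # a single pixel value off by 10
print(psnr(ground_truth, output))
```

Higher values mean the output is closer to the ground truth; identical images give an infinite PSNR.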

We can use the SAVE_AUG flag to save some examples of our training samples:

These are two pairs of hr-father and lr-son samples. We can see they were rotated 90 degrees left and a considerable amount of noise was added to the lr-son.

As mentioned before, this is done to make the net learn how to fix noise and to augment (enlarge) the dataset.
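The pairing described above can be sketched as follows. This is a simplified illustration and not the project’s image_generator: it assumes an image of shape (height, width, channels), uses block averaging and pixel repetition in place of proper interpolation, and assumes the lr-son is scaled back up to the hr-father’s size before being fed to the network. The helper name make_pair is hypothetical.

```python
import numpy as np


def make_pair(image, sr_factor=2, noise_std=30.0, rng=None):
    """Build one (lr_son, hr_father) training pair from a single image.

    Simplified sketch: block averaging and pixel repetition stand in for
    the interpolation an image library would provide.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Augment: rotate by a random multiple of 90 degrees, maybe mirror.
    hr_father = np.rot90(image, k=int(rng.integers(0, 4)))
    if rng.random() < 0.5:
        hr_father = np.fliplr(hr_father)
    # Crop so both spatial dimensions are divisible by sr_factor.
    h = hr_father.shape[0] - hr_father.shape[0] % sr_factor
    w = hr_father.shape[1] - hr_father.shape[1] % sr_factor
    hr_father = hr_father[:h, :w].astype(np.float64)
    # Downscale by averaging sr_factor x sr_factor blocks.
    blocks = hr_father.reshape(h // sr_factor, sr_factor,
                               w // sr_factor, sr_factor, -1)
    small = blocks.mean(axis=(1, 3))
    # Upscale back to the father's size so input and target shapes match.
    lr_son = small.repeat(sr_factor, axis=0).repeat(sr_factor, axis=1)
    if noise_std > 0:
        lr_son = lr_son + rng.normal(0.0, noise_std, lr_son.shape)
    return np.clip(lr_son, 0, 255), hr_father


def image_generator(image, **kwargs):
    """Yield batches of size one forever, as a Keras-style generator would."""
    while True:
        lr_son, hr_father = make_pair(image, **kwargs)
        yield lr_son[np.newaxis], hr_father[np.newaxis]


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(33, 48, 3)).astype(np.uint8)
lr_batch, hr_batch = next(image_generator(image, sr_factor=2, noise_std=30.0, rng=rng))
print(lr_batch.shape, hr_batch.shape)
```

Because every pair is derived from the single input image, the random rotation, flip and noise are what give the network a varied training set.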


The low-resolution input (subdir 034):

This is a simple interpolation with an SR_Factor of 2.

This is our super resolution output (1000 epochs, with the noise standard deviation set to 30):

*Notice the considerable reduction in noise and grain.

Here is the ground-truth image:

Training examples for subdir 067 (no noise was added):

The low-resolution input (subdir 067):

This is a simple interpolation with an SR_Factor of 2.

This is our super resolution output (1000 epochs, without noise):

* The improvement is subtle, but given the considerably shorter training time and the much smaller size of the neural network, it is quite impressive.

Let’s try running it again, this time with the noise standard deviation set to 30:

And with the noise std set to 70:

We see some improvement, but what might really help is adding stronger JPEG artifacts to the lr-sons rather than Gaussian noise.

The ground-truth image:

Let’s set some useful parameters. These are global parameters we set up outside of main(). By tweaking their values, we can change the behavior of our architecture and algorithm dramatically.
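A sketch of what such module-level settings might look like is shown below. SR_FACTOR, NOISE_FLAG and SAVE_AUG are names used in this article; NOISE_STD, EPOCHS, the two helper functions and every value are assumptions added for illustration.

```python
# Names SR_FACTOR, NOISE_FLAG and SAVE_AUG come from the article;
# NOISE_STD, EPOCHS and every value here are assumptions for illustration.
SR_FACTOR = 2        # upscaling factor for the target image
NOISE_FLAG = True    # add noise to the low-resolution samples?
NOISE_STD = 30.0     # standard deviation of that noise
SAVE_AUG = False     # save example training pairs to disk?
EPOCHS = 1000        # number of training epochs


def output_size(height, width, sr_factor=SR_FACTOR):
    """Size of the super-resolved image for a given input size."""
    return height * sr_factor, width * sr_factor


def effective_noise_std():
    """Noise level actually applied, honoring NOISE_FLAG."""
    return NOISE_STD if NOISE_FLAG else 0.0


print(output_size(240, 320), effective_noise_std())
```

Keeping these values in one place makes it easy to see which configuration produced a given result.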

For instance, SR_FACTOR determines the increase in resolution for our target image, while NOISE_FLAG controls whether noise is added to our low-resolution samples. Finally, as we tune and change these parameters for different configurations, we can use MissingLink’s dashboard to compare the outcomes:

Because the model is image-specific, it requires tuning per image to achieve optimal results. This can be done quickly by hand or automatically with a short script. All of the most important hyperparameters, such as epochs, filters and noise, are configurable through a parser on the command line. The model scores higher than the state-of-the-art EDSR model on many samples (try subdir 067 with 1000 epochs). This can be seen in the ZSSR vs. EDSR ratio on the PSNR quantitative test (metrics_ratio=1.03108).

*Notice that we get lower PSNR scores than in the original article, even on the EDSR benchmark examples. This is most likely due to differences between the PSNR implementation in sklearn and the one in MATLAB (which the authors used).
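The per-image tuning mentioned above could be automated with a small grid search along these lines. This is only a sketch: the function tune and its train_and_score argument are hypothetical, and the stand-in scorer does not train anything. It exists only so that the example runs.

```python
import itertools


def tune(train_and_score, epochs_grid, noise_grid):
    """Try every (epochs, noise_std) combination and keep the best score.

    train_and_score is a caller-supplied function (hypothetical here) that
    trains on one image with the given settings and returns a metric such
    as PSNR, where higher is better.
    """
    best_score, best_config = float("-inf"), None
    for epochs, noise_std in itertools.product(epochs_grid, noise_grid):
        score = train_and_score(epochs=epochs, noise_std=noise_std)
        if score > best_score:
            best_score = score
            best_config = {"epochs": epochs, "noise_std": noise_std}
    return best_config, best_score


def fake_score(epochs, noise_std):
    """Stand-in scorer so the sketch runs; it does not train anything."""
    return -abs(epochs - 1000) / 1000 - abs(noise_std - 30) / 100


print(tune(fake_score, epochs_grid=[500, 1000, 2000], noise_grid=[0, 30, 70]))
```

In practice, train_and_score would run a full training session for the image and return its PSNR or SSIM, so each grid point costs one complete run.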

In the next part we will dive into the code implementation and explain the functions further.

Originally published at