Coloring Black & White Images Using Deep Learning

Source: Deep Learning on Medium

Hola Amigos,

A few days back, during winter vacation, I visited my grandparents' home, where I came across black & white photos of my mother as a child. That got me wondering: could we colorize these images? So I started reading up on the topic and on related research papers.

After a few hours of searching, I found a method for colorizing these images and immediately started on the project. After an exhausting 6 hours I had a working model, although the results are not as perfect as I had imagined. Nonetheless, I thought I would write this blog post about the project. So let's see how to colorize black and white images:

Overview

Image colorization is the process of taking an input grayscale (black and white) image and then producing an output colorized image that represents the semantic colors and tones of the input (for example, an ocean on a clear sunny day must be plausibly “blue” — it can’t be colored “hot pink” by the model).

Previous methods for image colorization either:

  1. Relied on significant human interaction and annotation
  2. Produced desaturated colorization

The novel approach we are going to use here today instead relies on deep learning. We will utilize a Convolutional Neural Network capable of colorizing black and white images with results that can even “fool” humans!

Let's Get Started

Model Proposed By Zhang et al.

The technique we’ll be covering here today is from Zhang et al.’s 2016 ECCV paper, Colorful Image Colorization.

Previous approaches to black and white image colorization relied on manual human annotation and often produced desaturated results that were not “believable” as true colorizations.

Zhang et al. decided to attack the problem of image colorization by using Convolutional Neural Networks to “hallucinate” what an input grayscale image would look like when colorized.

Since we are given only intensity as input and have to guess the colors, RGB is an awkward choice: it has no dedicated illumination channel, because brightness is mixed into all three of R, G, and B. Hence we have two options: the YCbCr color space or the Lab color space, since the Y and L channels each encode illumination on their own.

In this tutorial we will use the Lab color space, but you can also try YCbCr.

Similar to the RGB color space, the Lab color space has three channels. But unlike the RGB color space, Lab encodes color information differently:

  • The L channel encodes lightness (intensity) only.
  • The a channel encodes green-red.
  • The b channel encodes blue-yellow.

For more information, you can refer to the Wikipedia article on the CIELAB color space.
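To make the split between lightness and color concrete, here is a minimal pure-Python sketch of the standard sRGB to CIE L*a*b* conversion (D65 white point). The function name srgb_to_lab is my own, and this is the textbook formula rather than OpenCV's exact implementation, so small numeric differences from cv2.cvtColor are possible.

```python
def srgb_to_lab(r, g, b):
    """Convert one sRGB pixel (components in [0, 1]) to CIE L*a*b* (D65)."""

    def to_linear(c):
        # undo the sRGB gamma curve
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = to_linear(r), to_linear(g), to_linear(b)

    # linear RGB -> CIE XYZ (sRGB primaries, D65 white)
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    # reference white used to normalise XYZ
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        # cube root with a linear segment near zero
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    lightness = 116 * fy - 16   # L: 0 (black) to 100 (white)
    a = 500 * (fx - fy)         # a: negative = green, positive = red
    b_axis = 200 * (fy - fz)    # b: negative = blue, positive = yellow
    return lightness, a, b_axis


for name, rgb in [("gray", (0.5, 0.5, 0.5)), ("red", (1, 0, 0)), ("blue", (0, 0, 1))]:
    print(name, [round(v, 1) for v in srgb_to_lab(*rgb)])
```

Any neutral gray lands at a ≈ 0 and b ≈ 0 with only L changing, which is why the L channel can stand in for a grayscale image.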

Since the L channel encodes only the intensity, we can use the L channel as our grayscale input to the network.

From there the network must learn to predict the a and b channels. Given the input L channel and the predicted ab channels we can then form our final output image.

The entire (simplified) process can be summarized as:

  1. Convert all training images from the RGB color space to the Lab color space.
  2. Use the L channel as the input to the network and train the network to predict the ab channels.
  3. Combine the input L channel with the predicted ab channels.
  4. Convert the Lab image back to RGB.

For more detailed information, please refer to the original paper by Zhang et al.

Colorizing black and white images with OpenCV

Our colorizer script only requires three imports: NumPy, OpenCV, and argparse.

Let’s go ahead and use argparse to parse command line arguments. This script requires that these four arguments be passed to the script directly from the terminal:

  • --image: The path to our input black/white image.
  • --prototxt: Our path to the Caffe prototxt file.
  • --model: Our path to the Caffe pre-trained model.
  • --points: The path to a NumPy cluster center points file.

With the above four flags and corresponding arguments, the script will be able to run with different inputs without changing any code.

# import the necessary packages
import numpy as np
import argparse
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", type=str, required=True,
    help="path to input black and white image")
ap.add_argument("-p", "--prototxt", type=str, required=True,
    help="path to Caffe prototxt file")
ap.add_argument("-m", "--model", type=str, required=True,
    help="path to Caffe pre-trained model")
ap.add_argument("-c", "--points", type=str, required=True,
    help="path to cluster center points")
args = vars(ap.parse_args())
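
A quick way to check the parser without a terminal is to hand parse_args an explicit list instead of letting it read sys.argv. The sketch below does that. The build_parser helper and all four file paths are hypothetical placeholders of mine, not files that ship with the model.

```python
import argparse


def build_parser():
    # same four required flags as the colorizer script
    ap = argparse.ArgumentParser(description="black and white image colorizer")
    ap.add_argument("-i", "--image", type=str, required=True)
    ap.add_argument("-p", "--prototxt", type=str, required=True)
    ap.add_argument("-m", "--model", type=str, required=True)
    ap.add_argument("-c", "--points", type=str, required=True)
    return ap


# hypothetical paths, used only to exercise the parser
example_argv = [
    "--image", "images/old_photo.jpg",
    "--prototxt", "model/network.prototxt",
    "--model", "model/network.caffemodel",
    "-c", "model/points.npy",  # short and long flags can be mixed
]
args = vars(build_parser().parse_args(example_argv))
print(args["image"], args["points"])
```

Leaving out any of the four flags makes argparse print a usage message and exit, which is what required=True is there for.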

Let’s go ahead and load our model and cluster centers into memory:

Now we load our Caffe model directly from the command line argument values; OpenCV can read Caffe models via the cv2.dnn.readNetFromCaffe function. Afterwards, we load the cluster center points from the path given for the points file.

The lines that follow the loading step:

  • Load centers for ab channel quantization used for rebalancing.
  • Treat each of the points as 1×1 convolutions and add them to the model.
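
To see why cluster centers can be plugged in as a 1×1 convolution, here is a minimal NumPy sketch. It assumes, as I understand the published network definition, that the model outputs one score per color bin for every pixel, scales those scores by a constant, and turns them into probabilities with a softmax; the 1×1 convolution then averages the bin centers using those probabilities. The random centers, random scores, and the tiny 4x4 map are stand-ins of mine, not the real data.

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for the points file: 313 (a, b) cluster centers (random, NOT real)
pts = rng.uniform(-110, 110, size=(313, 2)).astype("float32")

# same reshaping as in the script: one 1x1 kernel per output channel (a and b)
kernels = pts.transpose().reshape(2, 313, 1, 1)


def softmax(x, axis):
    # numerically stable softmax along the given axis
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


# pretend per-bin scores for a 4x4 feature map: (bins, height, width)
scores = rng.normal(size=(313, 4, 4)).astype("float32")

# constant scale (2.606 in the script), then per-pixel probabilities
probs = softmax(2.606 * scores, axis=0)

# a 1x1 convolution is a per-pixel matrix product over the channel axis,
# so each output pixel is the probability-weighted mean of the centers
ab = np.tensordot(kernels[:, :, 0, 0], probs, axes=([1], [0]))

print(ab.shape)  # two channels (a, b) on the 4x4 map
```

Because the output is a weighted average, every predicted (a, b) value stays inside the range spanned by the cluster centers.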

# load our serialized black and white colorizer model and cluster
# center points from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
pts = np.load(args["points"])
# add the cluster centers as 1x1 convolutions to the model
class8 = net.getLayerId("class8_ab")
conv8 = net.getLayerId("conv8_313_rh")
pts = pts.transpose().reshape(2, 313, 1, 1)
net.getLayer(class8).blobs = [pts.astype("float32")]
net.getLayer(conv8).blobs = [np.full([1, 313], 2.606, dtype="float32")]
# load the input image from disk, scale the pixel intensities to the
# range [0, 1], and then convert the image from the BGR to Lab color
# space
image = cv2.imread(args["image"])
scaled = image.astype("float32") / 255.0
lab = cv2.cvtColor(scaled, cv2.COLOR_BGR2LAB)

Now we can pass the input L channel through the network to predict the ab channels:

The forward pass of the L channel through the network happens in the net.setInput and net.forward calls, where cv2.dnn.blobFromImage wraps the L channel into the 4D blob the network expects.

Notice that after we call net.forward, on the same line, we go ahead and extract the predicted ab volume. I make it look easy here, but refer to the Zhang et al. documentation and demo on GitHub if you would like more details.
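
The indexing on that line is easier to follow when you track the array shapes. The sketch below uses plain NumPy arrays as stand-ins for the blob and the network output, so it runs without the model. The 56x56 output size is my assumption about this particular network; check ab.shape on your own run.

```python
import numpy as np

# stand-in for the mean-centered L channel after resizing to 224x224
L = np.zeros((224, 224), dtype="float32")

# blobFromImage adds batch and channel axes: (H, W) -> (1, 1, H, W)
blob = L[np.newaxis, np.newaxis, :, :]

# stand-in for net.forward(): one image, 2 channels (a and b),
# on a smaller spatial grid (56x56 is assumed here)
output = np.arange(1 * 2 * 56 * 56, dtype="float32").reshape(1, 2, 56, 56)

# [0, :, :, :] drops the batch axis -> (2, 56, 56)
# transpose((1, 2, 0)) moves channels last -> (56, 56, 2), as OpenCV expects
ab = output[0, :, :, :].transpose((1, 2, 0))

print(blob.shape, output.shape, ab.shape)
```

After the transpose, ab[y, x] holds the predicted (a, b) pair for one pixel, which is the layout cv2.resize needs in the next step.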

Afterwards comes the post-processing, which includes:

  • Grabbing the L channel from the original input image (not the resized one) and concatenating it with the predicted ab channels to form colorized
  • Converting the colorized image from the Lab color space to BGR, the channel order OpenCV uses for display
  • Clipping any pixel intensities that fall outside the range [0, 1]
  • Bringing the pixel intensities back into the range [0, 255]. During the preprocessing steps we divided by 255, and now we multiply by 255. I've also found that this scaling and "uint8" conversion isn't a requirement, but that it helps the code work across OpenCV 3.4.x and 4.x versions.

Finally, both the original image and the colorized image are displayed on the screen!
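
The concatenation, clipping, and scaling steps can be tried on toy arrays. This is a small NumPy sketch under my own assumptions: the to_uint8 helper name and the sample values are made up for illustration, and the Lab to BGR conversion is left out because it needs OpenCV.

```python
import numpy as np

# toy L channel (H, W) and predicted ab channels (H, W, 2)
height, width = 3, 4
L = np.full((height, width), 50.0, dtype="float32")
ab = np.zeros((height, width, 2), dtype="float32")

# L gets a trailing channel axis so it can be stacked with ab -> (H, W, 3)
lab_image = np.concatenate((L[:, :, np.newaxis], ab), axis=2)


def to_uint8(image_float):
    # clip to [0, 1] first so the cast to uint8 cannot go out of range
    clipped = np.clip(image_float, 0, 1)
    return (255 * clipped).astype("uint8")


# sample values, two of them outside [0, 1] on purpose
pixels = np.array([[-0.2, 0.0, 0.5, 1.0, 1.3]], dtype="float32")
result = to_uint8(pixels)

print(lab_image.shape)
print(result)
```

Out-of-range values end up pinned to 0 or 255. Note that the cast truncates rather than rounds, so 0.5 becomes 127.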

# resize the Lab image to 224x224 (the dimensions the colorization
# network accepts), split channels, extract the 'L' channel, and then
# perform mean centering
resized = cv2.resize(lab, (224, 224))
L = cv2.split(resized)[0]
L -= 50
# pass the L channel through the network which will *predict* the 'a'
# and 'b' channel values
print("[INFO] colorizing image...")
net.setInput(cv2.dnn.blobFromImage(L))
ab = net.forward()[0, :, :, :].transpose((1, 2, 0))
# resize the predicted 'ab' volume to the same dimensions as our
# input image
ab = cv2.resize(ab, (image.shape[1], image.shape[0]))

# grab the 'L' channel from the *original* input image (not the
# resized one) and concatenate the original 'L' channel with the
# predicted 'ab' channels
L = cv2.split(lab)[0]
colorized = np.concatenate((L[:, :, np.newaxis], ab), axis=2)
# convert the output image from the Lab color space to BGR (OpenCV's
# channel order), then clip any values that fall outside the range [0, 1]
colorized = cv2.cvtColor(colorized, cv2.COLOR_LAB2BGR)
colorized = np.clip(colorized, 0, 1)
# the current colorized image is represented as a floating point
# data type in the range [0, 1], so let's convert to an unsigned
# 8-bit integer representation in the range [0, 255]
colorized = (255 * colorized).astype("uint8")
# show the original and output colorized images
cv2.imshow("Original", image)
cv2.imshow("Colorized", colorized)
cv2.waitKey(0)

Image colorization results

Don't be alarmed: this is the kind of result to expect from the model, although the results I got myself fall a little short of this level of quality.

This is an image of Mark Twain, an American writer, humorist, entrepreneur, publisher, and lecturer. He was lauded as the “greatest humorist this country has produced”, and William Faulkner called him “the father of American literature”.

Here we can see that the grass and foliage are correctly colored a shade of green, although you can see these shades of green blending into Twain’s shoes and hands.

On the left, you can see the original input image of Robin Williams, a famous actor and comedian who passed away in 2014.

On the right, you can see the output of the black and white colorization model.

Summary

In today’s tutorial, you learned how to colorize black and white images using OpenCV and Deep Learning.

The image colorization model we used here today was first introduced by Zhang et al. in their 2016 publication, Colorful Image Colorization.

Using this model, we were able to colorize black & white images.

Our results, while not perfect, demonstrated the plausibility of automatically colorizing black and white images.

According to Zhang et al., their approach was able to “fool” humans 32% of the time!