Image Segmentation Techniques using Digital Image Processing, Machine Learning and Deep Learning…

Original article was published on Deep Learning on Medium

Table of Contents:

  1. What is digital image processing and its components?
  2. A brief introduction to different image segmentation methods using DIP.
  3. What are the latest and most efficient DIP methods used for image segmentation?
  4. Advantages and disadvantages of using DIP image segmentation methods.

What is digital image processing?

According to one of the best-known books on the subject, Digital Image Processing by Rafael C. Gonzalez, digital image processing (DIP) means processing a digital image by means of a digital computer, either to obtain an enhanced image or to extract useful information from it. Image segmentation is one of the phases (sub-categories) of DIP.

Image processing mainly includes the following steps:

  1. Importing the image via image acquisition tools.
  2. Analysing and manipulating the image to get the desired image (a segmented image in our case).
  3. Producing an output image or a report based on the analysis of that image.

Components of Digital Image Processing System:

  1. Image Acquisition – The phase in which an analogue image is converted into a digital image. This usually happens when we take a photo with a digital camera, since the scene as perceived by the human visual system (HVS) is an analogue signal.
  2. Image Enhancement – The phase in which the image pixel values are altered so that the image is perceived better by the HVS. This can be done in either the spatial domain or the frequency domain. Examples: histogram equalisation, noise reduction, deblurring, sharpening and softening, filtering, etc.
  3. Colour Space Conversion – Converting the image to a colour space in which the features of interest can be represented and extracted more precisely. Some examples of colour spaces are CIELAB, HSV and HSL.
  4. Digital Image Transformation – Representing the image in a different form so that the transformed image can be used for tasks such as image compression and feature extraction. These transformations include the discrete Fourier transform (DFT), the discrete cosine transform, the discrete wavelet transform, and representing images in terms of eigenvectors and eigenspaces, i.e. principal component analysis (PCA).
  5. Image Compression – Techniques for reducing the storage required to save an image or the bandwidth required to transmit it. It consists of various encoding techniques, for example run-length encoding, EBCOT, and lossless and lossy predictive coding.
  6. Morphological Image Processing – Tools for extracting image components that are useful in the representation and description of shape. Examples: dilation, erosion, boundary extraction, region filling, opening and closing, etc.
  7. Image Segmentation – Dividing an image into its constituent parts or objects. Examples: edge detection, boundary detection, thresholding, region based segmentation, etc.
  8. Image Descriptors –
  9. Object Recognition – The process that assigns a label to an object based on its descriptor.
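
To make the colour space conversion step a little more concrete, here is a minimal sketch using only Python's standard `colorsys` module. The helper names `rgb_image_to_hsv` and `hue_mask`, the tiny 2x2 image and the hue range chosen for "green" are all assumptions made for illustration; a real pipeline would normally use an image library instead of nested lists.

```python
import colorsys

def rgb_image_to_hsv(image):
    """Convert rows of (R, G, B) tuples in 0..255 to rows of (h, s, v) in 0..1."""
    return [
        [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0) for (r, g, b) in row]
        for row in image
    ]

def hue_mask(hsv_image, hue_low, hue_high, min_saturation=0.5):
    """Mark pixels whose hue lies in [hue_low, hue_high] and that are saturated enough."""
    return [
        [1 if hue_low <= h <= hue_high and s >= min_saturation else 0
         for (h, s, v) in row]
        for row in hsv_image
    ]

# Hypothetical 2x2 image: red, green on the first row; blue, grey on the second.
tiny_rgb = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (128, 128, 128)],
]
tiny_hsv = rgb_image_to_hsv(tiny_rgb)

# Assumed hue range for "green" (pure green has hue 1/3 on the 0..1 scale).
green_mask = hue_mask(tiny_hsv, 0.25, 0.42)
```

In HSV the colour of a pixel is captured by a single hue value, so a colour of interest can be selected with one range check, which is harder to express directly on RGB values.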

Now that we have a basic understanding of DIP and its components, we can dive into one of them: image segmentation.

A brief introduction to different image segmentation methods using DIP.

In this section we will learn how to segment an image using only image processing techniques, i.e. without machine learning or deep learning. Some of the techniques we will discuss were proposed as far back as the early nineties, which shows that image segmentation is not a new concept but one that predates the popularity of machine learning.

Below are the methods to segment an image using DIP:

1. Threshold based segmentation: This is the simplest method of image segmentation, in which each pixel value is compared with a threshold value. If the pixel value is smaller than the threshold, it is set to 0; otherwise, it is set to a maximum value (generally 255). The threshold value can be chosen arbitrarily. This algorithm is typically applied when we have to separate the foreground from the background. Its drawback is that it always segments the image into exactly two categories.
There are three common thresholding methods: (1) global thresholding, where a single threshold value is used for the whole image; (2) adaptive mean thresholding, where the threshold for each pixel is the mean of a neighbourhood of size s, which can be set manually; and (3) adaptive Gaussian thresholding, where the threshold is the weighted sum of the neighbourhood values, with the weights given by a Gaussian window.
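
The thresholding rule described above can be sketched in a few lines of plain Python. This is only an illustration on a nested-list image: the function names, the sample values and the choice to clip neighbourhoods at the image border are assumptions of this sketch, not part of any particular library.

```python
def global_threshold(image, threshold, max_value=255):
    """Set pixels below `threshold` to 0 and all other pixels to `max_value`."""
    return [[0 if p < threshold else max_value for p in row] for row in image]

def adaptive_mean_threshold(image, s=3, max_value=255):
    """Threshold each pixel against the mean of its s x s neighbourhood.

    Neighbourhoods are clipped at the image border (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    half = s // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - half), min(height, y + half + 1))
                for i in range(max(0, x - half), min(width, x + half + 1))
            ]
            local_threshold = sum(window) / len(window)
            row.append(0 if image[y][x] < local_threshold else max_value)
        result.append(row)
    return result

# Hypothetical 3x3 greyscale image: dark background, bright object on the right.
sample = [
    [10, 20, 200],
    [15, 220, 210],
    [12, 18, 205],
]
binary = global_threshold(sample, threshold=128)
adaptive = adaptive_mean_threshold(sample, s=3)
```

The global version uses one threshold for every pixel, while the adaptive version recomputes the threshold per pixel, which is what makes it more tolerant of uneven lighting.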

2. Edge based segmentation: With this technique, the edges detected in an image are assumed to represent object boundaries and are used to identify the objects. The Sobel and Canny edge detection algorithms are examples of edge based segmentation techniques.

Canny edge detection image segmentation
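
As a rough illustration of edge based segmentation, the sketch below computes the Sobel gradient magnitude and thresholds it to get an edge map. It is the simpler Sobel operator, not the full Canny pipeline shown in the figure above (Canny adds smoothing, non-maximum suppression and hysteresis). The function names, the test image and the threshold of 200 are assumptions for illustration.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(image):
    """Return the Sobel gradient magnitude; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    magnitude = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    pixel = image[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * pixel
                    gy += SOBEL_Y[j][i] * pixel
            magnitude[y][x] = math.hypot(gx, gy)
    return magnitude

def edge_map(image, threshold):
    """Mark pixels whose gradient magnitude exceeds `threshold` as edges (1)."""
    return [[1 if m > threshold else 0 for m in row] for row in sobel_magnitude(image)]

# Hypothetical 5x5 image with a vertical step edge between columns 1 and 2.
step = [[0, 0, 100, 100, 100] for _ in range(5)]
edges = edge_map(step, threshold=200)
```

Only the pixels on either side of the intensity step respond strongly, so the edge map outlines the boundary between the two regions.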

3. Morphological methods based segmentation: Morphology is the methodology for analysing the geometric structure inherent within an image. In this technique, each output pixel value is determined by comparing the corresponding input pixel with its neighbours, which produces a new binary image. This method is also used for foreground-background separation.

The basic morphological operations are dilation, erosion, opening and closing, which can be expressed in terms of logical AND and OR. This technique is mainly used for shape analysis and for noise removal after thresholding an image. Example: watershed algorithm.

1. Original image. 2. Thresholded image in which the foreground contains some noise. 3. Clean, noise-free image after performing closing and dilation operations on image 2.
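
The basic morphological operations can be sketched on a binary image as follows. This assumes a 3x3 square structuring element and ignores pixels outside the image; both are choices made for this sketch, and the function names are hypothetical. On 0/1 values, `max` plays the role of logical OR and `min` the role of logical AND.

```python
def _neighbourhood_op(image, combine):
    """Apply `combine` over the 3x3 neighbourhood of every pixel.

    Pixels outside the image are ignored (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    return [
        [
            combine(
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            )
            for x in range(width)
        ]
        for y in range(height)
    ]

def dilate(image):
    """A pixel becomes 1 if ANY neighbour is 1 (logical OR)."""
    return _neighbourhood_op(image, max)

def erode(image):
    """A pixel stays 1 only if ALL neighbours are 1 (logical AND)."""
    return _neighbourhood_op(image, min)

def opening(image):
    """Erosion followed by dilation: removes small foreground specks."""
    return dilate(erode(image))

def closing(image):
    """Dilation followed by erosion: fills small holes in the foreground."""
    return erode(dilate(image))

# Hypothetical examples: a foreground with a one-pixel hole, and a lone noise speck.
holed = [[1] * 5 for _ in range(5)]
holed[2][2] = 0
speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 1

filled = closing(holed)
cleaned = opening(speck)
```

Closing repairs the hole in the thresholded foreground, while opening removes the isolated speck, which is the kind of clean-up applied after thresholding.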

4. Graph based segmentation techniques: Graph-based approaches treat each pixel as a node in a graph. The edge weight between two nodes is proportional to the similarity between the neighbouring pixels. Pixels are grouped together to form segments, also known as superpixels, by minimising a cost function defined over the graph.

The grey nodes in the network denote the pixels, and the edges connect each pixel to its neighbours. The whole image is seen as an undirected graph, and the aim is to divide this graph into segments such as the red and green regions shown in the left image. The right image is the adjacency matrix that can be formed from the graph.
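
To see how an image turns into a graph and an adjacency matrix, here is a small sketch. It assumes a 4-connected grid (each pixel linked to its right and lower neighbour) and a similarity weight of 1 / (1 + |intensity difference|); both the connectivity and the weight formula are illustrative choices, and `grid_adjacency` is a hypothetical helper name.

```python
def grid_adjacency(image):
    """Build a symmetric weighted adjacency matrix for a 4-connected pixel grid.

    The pixel at (y, x) becomes node y * width + x. The weight is a similarity
    in (0, 1]: identical neighbours get 1.0, very different ones approach 0.
    """
    height, width = len(image), len(image[0])
    n = height * width
    matrix = [[0.0] * n for _ in range(n)]
    for y in range(height):
        for x in range(width):
            node = y * width + x
            for dy, dx in ((0, 1), (1, 0)):  # right and lower neighbour
                ny, nx = y + dy, x + dx
                if ny < height and nx < width:
                    neighbour = ny * width + nx
                    weight = 1.0 / (1.0 + abs(image[y][x] - image[ny][nx]))
                    matrix[node][neighbour] = weight
                    matrix[neighbour][node] = weight  # undirected graph
    return matrix

# Hypothetical 2x2 image: three dark pixels and one bright pixel.
tiny = [[10, 10], [10, 200]]
adjacency = grid_adjacency(tiny)
```

For this image the three dark pixels are linked by weights of 1.0, while the links to the bright pixel are close to 0, so a cut that separates the bright pixel is cheap.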

Some of the popular graph based image segmentation techniques are normalised cut by J. Malik et al., graph cut proposed by Veksler et al., and Efficient Graph-Based Image Segmentation by P. Felzenszwalb et al.

Implemented graph based image segmentation methods.
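
The sketch below gives a feel for how pixels can be grouped on such a graph. It is a deliberately simplified fixed-threshold merge using union-find, not the normalised cut, graph cut or Felzenszwalb algorithms named above (those use more careful, adaptive criteria). The function name, the 4-connectivity and the `max_difference` parameter are assumptions of this sketch.

```python
def segment_by_merging(image, max_difference):
    """Merge 4-connected neighbours whose intensity difference is small.

    Returns a label image in which every connected group shares one label.
    """
    height, width = len(image), len(image[0])
    parent = list(range(height * width))

    def find(node):
        # Follow parent links to the root, shortening the path on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for y in range(height):
        for x in range(width):
            node = y * width + x
            if x + 1 < width and abs(image[y][x] - image[y][x + 1]) <= max_difference:
                union(node, node + 1)
            if y + 1 < height and abs(image[y][x] - image[y + 1][x]) <= max_difference:
                union(node, node + width)

    # Relabel the roots as 0, 1, 2, ... in scan order.
    labels = {}
    return [
        [labels.setdefault(find(y * width + x), len(labels)) for x in range(width)]
        for y in range(height)
    ]

# Hypothetical 3x3 image: a dark region on the left, a bright column on the right.
picture = [
    [10, 12, 200],
    [11, 13, 205],
    [10, 12, 210],
]
segments = segment_by_merging(picture, max_difference=10)
```

Because the threshold is fixed, this behaves like connected-component labelling on the low-difference edges; the published algorithms replace the fixed threshold with a criterion that adapts to each region.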

5. Clustering based segmentation techniques: Starting from a rough initial clustering of pixels, gradient ascent methods iteratively refine the clusters until some convergence criterion is met, forming image segments or superpixels. These types of algorithms aim to minimise the distance between each cluster centre and the pixels assigned to it. This distance is defined differently for each algorithm, but it depends on the spatial distance between the pixel and the centre, the colour distance between the pixel and the centre, or both.

Clustering of data points where the solid data point is the cluster centre for each cluster.

Some of the popular clustering based image segmentation techniques are k-Means clustering, watershed algorithm, quick shift, SLIC, etc.

Implemented clustering based image segmentation methods.
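
As an illustration of the clustering idea, here is a minimal k-means sketch on greyscale intensities. It uses only the colour (intensity) distance, with no spatial term, and initialises the centres evenly between the darkest and brightest pixel so that the result is deterministic; the function names and these choices are assumptions made for this sketch.

```python
def nearest_centre(value, centres):
    """Index of the cluster centre closest to `value`."""
    return min(range(len(centres)), key=lambda c: abs(value - centres[c]))

def kmeans_segment(image, k, iterations=20):
    """Cluster pixel intensities with k-means; return (label image, centres)."""
    pixels = [p for row in image for p in row]
    low, high = min(pixels), max(pixels)
    # Deterministic initialisation: centres spread evenly over the intensity range.
    centres = [low + (high - low) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        assignment = [nearest_centre(p, centres) for p in pixels]
        updated = []
        for c in range(k):
            members = [p for p, a in zip(pixels, assignment) if a == c]
            # Keep the old centre if a cluster ends up empty.
            updated.append(sum(members) / len(members) if members else centres[c])
        if updated == centres:  # converged: centres no longer move
            break
        centres = updated
    width = len(image[0])
    flat = [nearest_centre(p, centres) for p in pixels]
    labels = [flat[i:i + width] for i in range(0, len(flat), width)]
    return labels, centres

# Hypothetical 3x3 image: a dark region on the left, a bright column on the right.
picture = [
    [10, 12, 200],
    [11, 13, 205],
    [10, 12, 210],
]
labels, centres = kmeans_segment(picture, k=2)
```

Methods such as SLIC follow the same assign-and-update loop but add the spatial distance to the colour distance, which is what keeps the resulting superpixels compact.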

6. Probabilistic image segmentation technique: In theory there are two types of clustering based segmentation: soft clustering and hard clustering. In hard clustering, which is discussed in point 5 above, each pixel is assigned to exactly one cluster (cluster 1, 2, ..., or k), whereas in soft clustering each pixel or data point is assigned to every cluster with some probability. Hence soft clustering is a probabilistic type of clustering. Soft clustering helps in situations where the clusters overlap, because the data points/pixels in the overlap region then have some probability of belonging to each of the overlapping clusters.

An example of soft and hard clustering techniques.

The Gaussian mixture model (GMM) is one of the soft clustering techniques that can be used for image segmentation.

The left image is the original image and right is the GMM segmented image with k=6.
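
To show what a soft assignment looks like in practice, here is a minimal sketch of a one-dimensional Gaussian mixture model fitted to pixel intensities with the EM algorithm. The function names, the evenly spread initial means, the variance floor `min_variance` and the sample values are all assumptions of this sketch; it uses k=2 on a handful of intensities rather than k=6 on a full colour image as in the figure above.

```python
import math

def log_gaussian(x, mean, variance):
    """Log of the normal density N(x | mean, variance)."""
    return -0.5 * math.log(2.0 * math.pi * variance) - (x - mean) ** 2 / (2.0 * variance)

def responsibilities(x, weights, means, variances):
    """Probability of each component given x, computed stably in the log domain."""
    logs = [math.log(w) + log_gaussian(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    peak = max(logs)
    scaled = [math.exp(value - peak) for value in logs]
    total = sum(scaled)
    return [s / total for s in scaled]

def fit_gmm(values, k=2, iterations=100, min_variance=1e-3):
    """Fit a 1-D Gaussian mixture with EM; return (weights, means, variances)."""
    low, high = min(values), max(values)
    means = [low + (high - low) * (i + 0.5) / k for i in range(k)]
    variances = [max(min_variance, ((high - low) / k) ** 2)] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: soft assignment of every value to every component.
        resp = [responsibilities(x, weights, means, variances) for x in values]
        # M-step: re-estimate each component from its weighted members.
        for c in range(k):
            mass = sum(r[c] for r in resp)
            if mass < 1e-12:
                continue  # leave an effectively empty component unchanged
            weights[c] = mass / len(values)
            means[c] = sum(r[c] * x for r, x in zip(resp, values)) / mass
            spread = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, values)) / mass
            variances[c] = max(min_variance, spread)
    return weights, means, variances

# Hypothetical pixel intensities: a dark group and a bright group.
intensities = [10, 12, 11, 13, 10, 12, 200, 205, 210]
weights, means, variances = fit_gmm(intensities, k=2)
dark_probabilities = responsibilities(11, weights, means, variances)
bright_probabilities = responsibilities(205, weights, means, variances)
```

Unlike the hard labels produced by k-means, every pixel here gets a probability for each component; a hard segmentation can still be obtained by picking the most probable component per pixel.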

Latest and efficient DIP methods used for image segmentation

In real-world applications, image segmentation algorithms are expected to segment a large number of diverse images. These images can differ in contrast, viewing angle, cropping and intensity. To meet this expectation and provide highly accurate segmentation, we need to select methods that are not sensitive to these changes. Combining multiple segmentation methods helps to tackle the diversity and uncertainty of the images: by making full use of the advantages of different algorithms on the basis of multi-feature fusion, a better segmentation can be achieved.

Moreover, clustering techniques, either soft or hard depending on the problem statement, are used extensively because of their high computational efficiency and good results.

Advantages and disadvantages of using DIP image segmentation methods

Advantages: These methods are simple and efficient in the case of clustering algorithms, and theoretically (mathematically) derived in the case of the other segmentation methods, which is not the case for CNN or DL methods. With theoretically derived methods we can easily inspect the intermediate details and see which features contribute to the outcome. In other words, these methods can answer the question "why are we getting this output?", which CNN or DL methods cannot yet answer.

Disadvantages: DIP methods tuned on one data set often do not generalise well to another, similar data set. For example, if we build an image segmentation pipeline to segment Indian clothes from a person, the same pipeline may not work for segmenting African or American people's clothes. This is because the selection and implementation of the DIP methods are highly customised to the target data set, and no parameter learning is done as in the case of ML and DL.


In this blog post we have discussed what digital image processing is and how image segmentation can be implemented using DIP methods. We have also discussed the different methods of image segmentation and the advantages and disadvantages of DIP image segmentation methods. In my next post I will discuss image segmentation techniques using machine learning and DIP, which produce much more accurate results and often generalise well.


I would like to thank my DIP course instructor, Prof. Neelam Sinha of IIIT Bangalore, for teaching the DIP course and imparting valuable knowledge.


All the code for the implemented algorithms shown in this blog post is available at this link.