Understanding DICOMs

Original article was published on Artificial Intelligence on Medium

An in-depth, hands-on guide to viewing and manipulating DICOM images using fastai2’s medical imaging module and getting them ready for machine learning.

Fastai2 will officially be released in July 2020

Image produced using pixel selection and an `ocean_r` colormap

What are DICOMs?

DICOM (Digital Imaging and Communications in Medicine) is the de-facto standard that establishes rules for exchanging medical images (X-ray, MRI, CT) and associated information between imaging equipment from different vendors, computers, and hospitals. The DICOM format provides a suitable means of meeting health information exchange (HIE) standards for transmitting health-related data among facilities, as well as HL7 standards, the messaging standard that enables clinical applications to exchange data.

Typical Radiology Workflow [Image Credit]

DICOM files typically have a .dcm extension and provide a means of storing data in separate ‘tags’ such as patient information, image/pixel data, the machine used and a lot more information (explained below).

A DICOM file predominantly consists of a header and image pixel intensity data packed into a single file. The information within the header is organized as a standardized series of tags. By extracting data from these tags one can access important information regarding the patient demographics, study parameters and a lot more.

Parts of a DICOM [Image Credit]

16-bit DICOM images have values ranging from -32768 to 32767, while 8-bit grey-scale images store values from 0 to 255. The value ranges in DICOM images are useful because they correlate with the Hounsfield scale, a quantitative scale for describing radio-density (a way of viewing different tissue densities; more on this below).
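The storage ranges above can be checked directly with NumPy’s integer type info; this is just a quick sanity check, not part of any DICOM workflow:

```python
import numpy as np

# Signed 16-bit samples span -32768..32767; unsigned 8-bit spans 0..255
i16 = np.iinfo(np.int16)
u8 = np.iinfo(np.uint8)
print(i16.min, i16.max)  # -32768 32767
print(u8.min, u8.max)    # 0 255
```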

Installation Requirements

These are the dependencies you will need installed on your computer to follow this tutorial:

  • fastai2, which will officially be released in July 2020; installation instructions can be viewed on its GitHub page: fastai2
  • pydicom, a Python package for parsing DICOM files that makes it easy to convert DICOM files into pythonic structures for easier manipulation
  • scikit-image, a collection of algorithms for image processing
  • kornia, a library of packages containing operators that can be inserted within neural networks to train models to perform image transformations, epipolar geometry, depth estimation, and low-level image processing such as filtering and edge detection directly on tensors

For more information on how to use fastai2’s medical imaging module, head over to my GitHub page or my tutorial blog on medical imaging (which is better for viewing notebook tutorials :))


Here is a list of 3 DICOM datasets that you can play around with. Each of these datasets has different attributes, showing the vast diversity of information that can be contained within different DICOM datasets.

  • the SIIM_SMALL dataset (250 DICOM files, ~30MB) is conveniently provided in the fastai library but is limited in some of its attributes; for example, it does not have RescaleIntercept or RescaleSlope, and its pixel range is limited to 0 to 255
  • Kaggle has an easily accessible (437MB) CT medical image dataset from the cancer imaging archive. The dataset consists of 100 images (512px by 512px) with pixel ranges from -2000 to +2000
  • The Thyroid Segmentation in Ultrasonography Dataset provides low-quality (around 253px by 253px) DICOM images where each DICOM image has multiple frames (1000 on average)

Let’s load the dependencies:

#Load the dependencies
from fastai2.basics import *
from fastai2.callback.all import *
from fastai2.vision.all import *
from fastai2.medical.imaging import *
import pydicom
import seaborn as sns
matplotlib.rcParams['image.cmap'] = 'bone'
from matplotlib.colors import ListedColormap, LinearSegmentedColormap

Some knowledge of fastai2 is assumed and is beyond the scope of this tutorial; the fastai2 docs page has some excellent tutorials to get you started quickly.

….a bit more about how data is stored in DICOM files

DICOM files are opened using pydicom.dcmread, for example using the SIIM_SMALL dataset:

#get dicom files (pneumothorax_source points at the SIIM_SMALL data,
#e.g. pneumothorax_source = untar_data(URLs.SIIM_SMALL))
items = get_dicom_files(pneumothorax_source, recurse=True, folders='train')
#now lets read a file:
img = items[10]
dimg = dcmread(img)

You can now view all the information contained within the DICOM file. Explanation of each element is beyond the scope of this tutorial but this site has some excellent information about each of the entries. Information is listed by the DICOM tag (eg: 0008, 0005) or DICOM keyword (eg: Specific Character Set).

So, for example, evaluating `dimg` now generates:

‘head’ information created from `dimg`

Here are some key points on the tag information above:

  • Pixel Data (7fe0 0010)(last entry) — This is where the raw pixel data is stored. The order of pixels encoded for each image plane is left to right, top to bottom, i.e., the upper left pixel (labeled 1,1) is encoded first
  • Photometric Interpretation (0028, 0004) — aka color space. In this case it is MONOCHROME2 where pixel data is represented as a single monochrome image plane where the minimum sample value is intended to be displayed as black info
  • Samples per Pixel (0028, 0002) — This should be 1 as this image is monochrome. This value would be 3 if the color space was RGB for example
  • Bits Stored (0028 0101) — Number of bits stored for each pixel sample
  • Pixel Representation (0028 0103) — can either be unsigned(0) or signed(1)
  • Lossy Image Compression (0028 2110) — 00 image has not been subjected to lossy compression. 01 image has been subjected to lossy compression.
  • Lossy Image Compression Method (0028 2114) — states the type of lossy compression used (in this case JPEG Lossy Compression, as denoted by CS:ISO_10918_1)

Important tags that are not included in the SIIM_SMALL dataset:

  • Rescale Intercept (0028, 1052) — the value b in the relationship between stored values (SV) and output units: output units = m*SV + b.
  • Rescale Slope (0028, 1053) — m in the equation specified by Rescale Intercept (0028,1052).

The RescaleIntercept and RescaleSlope are applied to transform the pixel values of the image into values that are meaningful to the application. Calculating the new values usually follows a linear formula:

  • NewValue = (RawPixelValue * RescaleSlope) + RescaleIntercept

and when the relationship is not linear, a LUT (LookUp Table) is utilized.
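The linear rescale can be sketched with plain NumPy. The raw values below are hypothetical, with the slope and intercept taken from the CT dataset used later in this tutorial (RescaleSlope=1, RescaleIntercept=-1024):

```python
import numpy as np

# Hypothetical stored values plus rescale parameters from a CT header
raw_pixels = np.array([0, 1024, 1918], dtype=np.int16)
slope, intercept = 1.0, -1024.0

# NewValue = (RawPixelValue * RescaleSlope) + RescaleIntercept
hu = raw_pixels * slope + intercept
print(hu)  # -1024, 0 and 894 HU
```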

All this will become clearer when we view another dataset later below

The list above is limited, and playing around with the other 2 datasets will show you that there can be a vast number of tags per file. It’s worth pointing out that there may be additional information (however this is not always the case), such as an ImageComments tag, which may contain information useful for modelling.

… what about that Pixel Data?

By default pydicom reads pixel data as the raw bytes found in the file, and PixelData is typically not immediately useful as the data may be stored in a variety of different ways:

  • The pixel values may be signed or unsigned integers, or floats
  • There may be multiple image frames
  • There may be multiple planes per frame (i.e. RGB) and the order of the pixels may be different

These are only a few examples; more information can be found on the pydicom website
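The first point above (signed vs unsigned samples) is easy to demonstrate with the standard library: the same raw bytes decode to different pixel values depending on the assumed representation. The sample values are made up:

```python
import struct

# Four little-endian 16-bit samples packed as raw bytes, as they might
# sit inside PixelData
raw = struct.pack('<4h', -1024, 0, 500, 1918)

# Interpreting the identical bytes as signed vs unsigned gives
# different pixel values
signed = struct.unpack('<4h', raw)    # (-1024, 0, 500, 1918)
unsigned = struct.unpack('<4H', raw)  # (64512, 0, 500, 1918)
```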

This is what PixelData looks like:


Because of the complexity in interpreting PixelData, pydicom provides an easy way to get it in a convenient form: pixel_array which returns a numpy.ndarray containing the pixel data:

dimg.pixel_array, dimg.pixel_array.shape

Loading DICOMs which have 1 frame per file

The SIIM_SMALL dataset is a DICOM dataset where each DICOM file has a pixel_array that contains 1 image. In this case the show function within fastai.medical.imaging conveniently displays the image

source = untar_data(URLs.SIIM_SMALL)
items = get_dicom_files(source)
patient1 = dcmread(items[0])
patient1.show() #will display the image above

How about loading an image from the CT medical image dataset, which also contains 1 frame per DICOM file? This image is a slice of a CT scan looking at the lungs with the heart in the middle.

csource = Path('C:/PillView/NIH/data/dicoms')
citems = get_dicom_files(csource)
patient2 = dcmread(citems[0])

However what if a DICOM dataset has multiple frames per file?

Loading DICOMs which have multiple frames per file

The Thyroid Segmentation in Ultrasonography Dataset is a dataset where each DICOM file has multiple frames per file.

Out of the box you cannot view images from DICOM files with multiple frames with fastai2, as this will result in a TypeError, but we can customize the show function so that it checks whether a file has more than 1 frame; if it does, you can choose how many frames to view (default is 1), and it also prints how many frames there are in each file.

#updating to handle multiple frames
@patch
@delegates(show_image, show_images)
def show(self:DcmDataset, frames=1, scale=True, cmap=plt.cm.bone, min_px=-1100, max_px=None, **kwargs):
    px = (self.windowed(*scale) if isinstance(scale,tuple)
          else self.hist_scaled(min_px=min_px,max_px=max_px,brks=scale) if isinstance(scale,(ndarray,Tensor))
          else self.hist_scaled(min_px=min_px,max_px=max_px) if scale
          else self.scaled_px)
    if px.ndim > 2:
        gh = []
        p = px.shape; print(f'{p[0]} frames per file')
        for i in range(frames): u = px[i]; gh.append(u)
        show_images(gh, cmap=cmap, **kwargs)
    else:
        print('1 frame per file')
        show_image(px, cmap=cmap, **kwargs)

Fastai2 has a very intuitive way of patching code (hence the @patch decorator above) so you can easily add functionality to existing classes.

So now we can load the thyroid dataset:

tsource = Path('C:/PillView/NIH/data/thyroid')
titems = get_dicom_files(tsource)
patient3 = dcmread(titems[0])
patient3.show(10) #specifying 10 shows the first 10 frames out of a total of 932 frames in this 1 file

Understanding Tissue Densities

Using the CT medical image dataset we can now play around with other useful fastai.medical.imaging functionality. As mentioned before, 16-bit DICOM images can have pixel values ranging from -32768 to 32767.

patient2 above is a DICOM file from this dataset.

#lets convert the pixel_array into a tensor; fastai2 can conveniently do this for us
tensor_dicom = pixels(patient2) #convert into tensor
print(f'RescaleIntercept: {patient2.RescaleIntercept:1f}\nRescaleSlope: {patient2.RescaleSlope:1f}\nMax pixel: '
      f'{tensor_dicom.max()}\nMin pixel: {tensor_dicom.min()}\nShape: {tensor_dicom.shape}')
RescaleIntercept, RescaleSlope, Max and Min pixel values

In this image the RescaleIntercept is -1024, the RescaleSlope is 1, the max and min pixels are 1918 and 0 respectively and the image size is 512 by 512

Plotting a histogram of pixel intensities, you can see where the bulk of the pixels are located:

plt.hist(tensor_dicom.flatten(), color='c')
Histogram of pixel values

The histogram shows that the minimum pixel value is 0 and the maximum is 1918. The distribution is predominantly bi-modal, with the majority of pixels between 0 and 100 and between 750 and 1100.

This image has a RescaleIntercept of -1024 and a RescaleSlope of 1. These two values allow for transforming pixel values into Hounsfield Units (HU). Densities of different tissues on CT scans are measured in HUs.

Most CT scans range from -1000HUs to +1000HUs, where water is 0HUs, air is -1000HUs, and the denser the tissue the higher the HU value. Metals can have much higher HU values (+2000HUs and above), so for medical imaging a range of -1000 to +1000HUs is suitable.

The pixel values above do not correctly correspond to tissue densities. For example most of the pixels are between pixel values 0 and 100 which correspond to water but this image is predominantly showing the lungs which are filled with air. Air on the Hounsfield scale is -1000 HUs.

This is where RescaleIntercept and RescaleSlope become important.

Remember that:

rescaled pixel = pixel * RescaleSlope + RescaleIntercept

Fastai2 provides a convenient method, scaled_px, that converts a pixel_array into a tensor and scales the values by taking RescaleIntercept and RescaleSlope into consideration.

#convert into tensor taking RescaleIntercept and RescaleSlope into #consideration
tensor_dicom_scaled = scaled_px(patient2)
plt.hist(tensor_dicom_scaled.flatten(), color='c')
Scaled pixel histogram using `scaled_px`

Now let’s look at the maximum and minimum pixel values:

print(f'Max pixel: {tensor_dicom_scaled.max()}\nMin pixel: {tensor_dicom_scaled.min()}')

After re-scaling, the maximum pixel value is 894 and the minimum is -1024, and we can now correctly see which parts of the image correspond to which parts of the body based on the Hounsfield scale.

Looking at the top end of the histogram what does the image look like with values over 300 HUs?

The show function has the capability of specifying max and min values

patient2.show(max_px=894, min_px=300, figsize=(5,5))
Viewing pixel values between 300px and 894px

Recall that we patched the show function; it now reports that this dataset has 1 frame per file. HU values above +300 typically show the bone structures within the image.

What about within the range of -250 and 250, where we have a spike of pixels?

patient2.show(max_px=250, min_px=-250, figsize=(5,5))
Pixel range between -250px and +250px

Within this range you can now see the aorta and parts of the heart (image middle) as well as muscle and fat.

What about between -250px and -600px, where we notice a low distribution of pixels?

patient2.show(max_px=-250, min_px=-600, figsize=(5,5))

In this range you can just make out outlines. The histogram shows that within this range there are not many pixels.

What about between -1000px and -600px?

patient2.show(max_px=-600, min_px=-1000, figsize=(5,5))

Within this range you can clearly see the bronchi within the lungs

patient2.show(max_px=-900, min_px=-1024, figsize=(5,5))

At this range you can now also clearly see the curve of the scanner.

The show function by default has a max_px value of None and a min_px value of -1100

patient2.show(max_px=None, min_px=-1100, figsize=(5,5))

Image re-scaling as done above is really for the benefit of humans. Computer screens can display about 256 shades of grey, and the human eye is only capable of detecting about a 6% change in greyscale (ref), meaning the human eye can only detect about 17 different shades of grey.

DICOM images may have a wide range from -1000 to +1000, and for humans to see relevant structures within the image a process of windowing is utilized. Fastai2 provides various dicom_windows so that only specific HU values are displayed on the screen. More about windowing can be found here.

….side note on Windowing

DICOM images can contain a high amount of pixel values and windowing can be thought of as a means of manipulating these values in order to change the appearance of the image so particular structures are highlighted. A window has 2 values:

l = window level or center aka brightness

w = window width or range aka contrast

Example: from here

Brain Matter window

l = 40 (window center) w = 80 (window width)

Voxels displayed range from 0 to 80

Calculating voxel values:

  • lowest_visible_value = window_center - window_width / 2
  • highest_visible_value = window_center + window_width / 2

(lowest_visible_value = 40 - (80/2), highest_visible_value = 40 + (80/2))

Hence all values above 80 will be white and all values below 0 will be black.
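The brain matter window above can be sketched as a clip operation in NumPy; apply_window is a hypothetical helper written for illustration, not a fastai2 function:

```python
import numpy as np

def apply_window(hu, center, width):
    """Clip HU values to the visible range defined by a window center/width."""
    lo = center - width / 2
    hi = center + width / 2
    return np.clip(hu, lo, hi)

# Brain matter window: l=40, w=80 gives a visible range of 0..80;
# values outside that range are clipped to black (0) or white (80)
px = np.array([-100, 0, 40, 80, 200])
windowed = apply_window(px, center=40, width=80)
```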

….but does a computer really care about Windowing, RescaleIntercept , RescaleSlope?

While windowing is for the benefit of the human, computers produce better results from training when the data has a uniform distribution, as mentioned in this article: don’t see like a radiologist

Looking back at the pixel distribution we can see that the image does not have a uniform distribution

Histogram of rescaled pixels

fastai2 has a function freqhist_bins that splits the range of pixel values into groups, depending on what value you set for n_bins, such that each group has around the same number of pixels.

For example if you set n_bins to 1, the pixel values are split into 2 distinct pixel bins.

ten_freq = tensor_dicom_scaled.freqhist_bins(n_bins=1)
fh = patient2.hist_scaled(ten_freq)
plt.hist(ten_freq.flatten(), color='c'); show_image(fh, figsize=(7,7))
Splitting the pixels into 2 distinct bins

In this case you see the 2 polar sides of the image at -1000HUs you see the air portions and at 500HUs you see the bone structures clearly but the distribution is still not fully acceptable for the machine learning model.

With n_bins at 100 (this is the default number used by show):

Pixels distributed within 100 bins

With n_bins at 100000, the pixels show a more uniform distribution:

Pixels distributed over 100000 bins
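The equal-frequency idea behind freqhist_bins can be sketched in a few lines of NumPy. This is an illustrative re-implementation, not fastai2’s code: sort the pixels and take breakpoints at evenly spaced ranks, so each bin holds roughly the same number of pixels.

```python
import numpy as np

def freq_bins(px, n_bins=100):
    """Breakpoints placed at evenly spaced ranks of the sorted pixel values."""
    sorted_px = np.sort(px.flatten())
    ranks = np.linspace(0, len(sorted_px) - 1, n_bins + 1).astype(int)
    return np.unique(sorted_px[ranks])

# On uniformly spread values the breakpoints are evenly spaced; on skewed
# data they crowd together where the pixels crowd together
brks = freq_bins(np.arange(100), n_bins=4)
print(brks)  # [ 0 24 49 74 99]
```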

What effect does this have on training outcomes? That will be the topic of the next tutorial.


  1. https://www.himss.org/interoperability-and-health-information-exchange
  2. https://www.hl7.org/implement/standards/
  3. https://en.wikipedia.org/wiki/Hounsfield_scale
  4. https://github.com/fastai/fastai2
  5. https://www.kaggle.com/jhoward/don-t-see-like-a-radiologist-fastai/data
  6. https://www.kaggle.com/dcstang/see-like-a-radiologist-with-systematic-windowing