Source: Deep Learning on Medium
The need for machine learning grows every day. It has made it possible to solve many problems in speech and text recognition and in the classification and segmentation of objects in images, so it is natural that the technology has found applications in medicine. One pressing problem is diagnosing brain tumors: neoplasms are often detected only at a late stage of development, when treating the patient is practically impossible.
Magnetic resonance imaging (MRI) is one of the most important technologies for diagnosing tumors: it produces layered images of brain tissue from different angles with a slice thickness of up to 8 mm, which makes it possible to localize neoplasms fairly accurately even at early stages of development. Although MRI was invented back in the 1970s, it remains the most popular method for diagnosing brain diseases (Alzheimer’s disease, epilepsy, cancer, etc.). Manual analysis of MRI images, however, is slow and inefficient: people tire quickly of routine work and, as a result, make mistakes. If a computer learns to recognize a brain tumor in MRI images quickly and accurately at an early stage of development, it will be possible to significantly reduce the number of medical errors, detect the disease earlier in more patients, and thereby save many lives. Automating this routine analysis would also relieve doctors of a substantial workload.
There are many types of malignant brain tumors; the most common and aggressive is glioblastoma. Such formations are detected and localized on layered MRI images of the brain, which is scanned in several tomography modes:

- T1-weighted (T1), the most common mode: fat appears white, soft tissue in shades of gray, and cerebrospinal fluid black.
- T1-weighted with contrast (T1c): a contrast agent is injected into the patient, and the affected areas appear bright white, standing out sharply against the soft tissue.
- T2-weighted (T2): fluids and fat appear white, which makes it possible to detect the perifocal edema that forms around a tumor.
- FLAIR: used to distinguish perifocal edema from other brain fluids.
Convolutional neural networks, named for their use of convolutional layers, have achieved the greatest success in image recognition. The principle of a convolutional layer is quite simple: a weight matrix, called a filter, slides over the input multidimensional data array. The region of the input the filter currently covers is called its receptive field. At each position, the filter is multiplied element-wise with the receptive field, and the sum of the products is written into the output matrix, called a feature map.
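The sliding multiply-and-sum described above can be sketched in a few lines of plain Python (real frameworks vectorize this heavily; the filter values here are just an illustrative vertical-edge detector):

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding,
    multiplying element-wise and summing each window to produce
    one value of the output feature map."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            # element-wise product of the kernel and its receptive field
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# A tiny image with a vertical edge, and a filter that responds to it:
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
kernel = [[-1, 1],
          [-1, 1]]
print(conv2d(image, kernel))  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The strong responses (2) appear exactly where the filter sits on the edge, and, as the next paragraph notes, the 3×3 output is smaller than the 4×4 input.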
Each filter looks for some feature of the object being recognized (a certain color, simple lines, etc.), so the more filters we use, the richer the representation and, generally, the more accurate the recognition. Note that the output array is smaller than the input. Importantly, a neural network uses not one but several convolutional layers, which allows it to recognize complex objects. It works like this: the feature maps produced by the first convolutional layer contain simple features, and the filters of the next convolutional layer recognize combinations of those simple elements (for example, rectangles, which are combinations of straight lines). Subsequent layers work with increasingly sophisticated parts of the recognized object.
Activation and pooling layers are placed between the convolutional layers. The most popular activation function is ReLU:

ReLU(x) = max(0, x)

It excludes from the computation neurons whose output signal is negative. A pooling layer compresses several neighboring values into one (for example, several pixels into a single pixel), which speeds up training and helps avoid overfitting (the network memorizing the finest details of the training objects, after which it can recognize them only on the training set). Convolutional neural networks are now widely used for recognizing tumors in MRI images of the brain; we used the UNet architecture for this purpose. It is built around a contracting path and an expanding path of convolutional layers.
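Both operations are easy to show concretely. A minimal plain-Python sketch of ReLU and non-overlapping 2×2 max pooling (the feature-map values below are made up for illustration):

```python
def relu(x):
    """ReLU(x) = max(0, x): negative activations are zeroed out."""
    return max(0, x)

def max_pool_2x2(fmap):
    """Compress each non-overlapping 2x2 block of the feature map
    to its maximum value, halving both spatial dimensions."""
    out = []
    for i in range(0, len(fmap), 2):
        row = []
        for j in range(0, len(fmap[0]), 2):
            row.append(max(fmap[i][j], fmap[i][j + 1],
                           fmap[i + 1][j], fmap[i + 1][j + 1]))
        out.append(row)
    return out

fmap = [[1, -2, 3, 0],
        [4, 0, -1, 2],
        [0, 5, 1, 1],
        [-3, 2, 0, 6]]
activated = [[relu(v) for v in row] for row in fmap]
print(max_pool_2x2(activated))  # [[4, 3], [5, 6]]
```

The 4×4 map becomes 2×2: each output value summarizes a neighborhood, which is what makes the network both faster to train and less prone to memorizing pixel-level detail.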
The contracting path follows the classical scheme of convolutional networks (convolution => ReLU => pooling => convolution => ReLU => pooling => …). The last layer uses 1×1 filters to map the abstract feature maps to the class vector (in our case the classes are the edema around the tumor, the tumor core, and the living tumor cells within it). The expanding path uses transposed-convolution and upsampling layers (upsampling is the opposite of pooling). Through transposed convolutions, upsampling, and concatenation of feature maps with the maps obtained at the corresponding steps of the contracting path (these skip connections compensate for the loss of boundary information at each convolution), the network produces a feature map that localizes the recognized regions relative to the input data.
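An illustrative way to see this architecture (not the actual training code) is to track how tensor shapes flow through a small UNet-style network. The sketch below assumes 'same'-padded convolutions, so only pooling and upsampling change the spatial size; 240×240 is the in-plane size of BRATS volumes, and the 4 input channels are the T1, T1c, T2 and FLAIR modes:

```python
# Shapes are (height, width, channels).

def conv(shape, out_channels):     # 3x3 conv, 'same' padding: size kept
    h, w, _ = shape
    return (h, w, out_channels)

def pool(shape):                   # 2x2 max pooling halves H and W
    h, w, c = shape
    return (h // 2, w // 2, c)

def upsample(shape):               # transposed conv doubles H and W
    h, w, c = shape
    return (h * 2, w * 2, c)

def concat(a, b):                  # skip connection: join channel dims
    assert a[:2] == b[:2], "spatial sizes must match to concatenate"
    return (a[0], a[1], a[2] + b[2])

x = (240, 240, 4)                  # 4 input modes: T1, T1c, T2, FLAIR

# Contracting path: convolve, remember the map for the skip, then pool.
skip1 = conv(x, 64);   x = pool(skip1)
skip2 = conv(x, 128);  x = pool(skip2)
x = conv(x, 256)                   # bottleneck

# Expanding path: upsample, concatenate the matching skip map, convolve.
x = concat(upsample(x), skip2);  x = conv(x, 128)
x = concat(upsample(x), skip1);  x = conv(x, 64)

x = conv(x, 3)                     # 1x1 conv to the 3 tumor classes
print(x)                           # (240, 240, 3)
```

The output has the same spatial size as the input, with one channel per class (edema, tumor core, living tumor cells), which is exactly what a per-voxel segmentation needs.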
The UNet network was trained on the BRATS 2017 archive, which contains brain MRI scans of patients in nii.gz format, taken in the T1, T1c, T2 and FLAIR modes. The program is built on the TensorFlow deep learning framework and NiftyNet, a platform for deep learning on medical images.
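NiftyNet is driven by an INI-style configuration file rather than hand-written training code. A hypothetical fragment of what such a file might look like is shown below; the section and key names, paths, and values are assumptions from memory and may differ between NiftyNet versions, so treat this as a sketch, not a working config:

```ini
; Hypothetical NiftyNet configuration sketch; names and values are
; illustrative assumptions, not verified against a NiftyNet release.
[T1]
path_to_search = ./BRATS2017/T1
filename_contains = t1

[FLAIR]
path_to_search = ./BRATS2017/FLAIR
filename_contains = flair

[NETWORK]
name = unet

[TRAINING]
lr = 0.0001
loss_type = Dice
max_iter = 10000

[SEGMENTATION]
image = T1, FLAIR
num_classes = 4
```

One modality section would exist per imaging mode (T1, T1c, T2, FLAIR), and the [SEGMENTATION] application section would tie them together as network input.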
The program writes its result as a file in nii.gz format. To view it, open both the input file and the result file in a medical image viewer; we used ITK-SNAP 3.6.0. When the program processes an MRI scan of a patient’s brain in nii.gz format, ITK-SNAP displays the result as images of the brain in three projections (axial, sagittal and frontal) with highlighted regions of the edema, the tumor core and the living tumor cells.