Source: Deep Learning on Medium
Making deep learning models ready for worst-case scenarios and cross-platform deployment with the OpenVINO toolkit.
As 2020 arrives, the community of deep learning experts and enthusiasts is looking forward to a significant year of innovation in the field. With a growing number of deep learning models being built every day around the world, humankind's dependency on the cloud and the network (especially TCP) is expanding day by day. You might be wondering: what exactly is wrong with cloud dependencies?
Imagine you have a face-detection lock at your home that is improperly built: the developer deployed the model on the cloud, so the device has to call cloud services for inference. Then one day you face a remarkably poor network connection, and without any security-override mechanism configured, you become the victim of your own security system.
Another real-world instance of such a scenario is the story of a renowned multispeciality hospital in Bhubaneswar, Odisha, India. It had a deep learning network, properly trained and tuned with domain expertise, but it was implemented in such a way that it had to stream the patient's heart rate every second to a web server over TCP to detect myocardial infarction. After a devastating cyclone hit coastal Odisha, the system was of no use because there was no cellular connection at all.
If proper steps are not taken when deploying deep learning models that must make critical decisions at any moment, the model can fail at the worst possible time. As deep learning advances rapidly into critical decision-making operations, any system not designed with such edge cases in mind can face identical buffeting circumstances; immense problems can arise if security surveillance or healthcare systems fail all of a sudden.
To shield models from these concerns, we need to deploy them so that they can make real-time decisions without attaching to any cloud service or to the internet. This approach is also more secure, since the deployed model is out of the internet's reach, so workloads that require a maximum level of security can run directly on the device. Enthusiasts call such models Edge AI: the model lives on the device itself and needs no network connection for inference. Let us now see how this is achieved.
The models we build and train using frameworks such as TensorFlow, Caffe, PyTorch, ONNX, etc. can be substantially large and resource-hungry, and they can also be architecture-dependent, i.e. constrained to a specific platform or to particular CPU/GPU kernels. To make these models able to serve inference from any device and from anywhere, we convert them into the Intermediate Representation (IR) format, which stores the schema of the model in an .xml file and the weights and biases in a .bin file.
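As a sketch of what that conversion looks like in practice, the command below runs OpenVINO's Model Optimizer on a frozen TensorFlow graph. The installation path and model filename here are placeholders, not values from this article; adjust them to your own setup.

```shell
# Sketch: converting a frozen TensorFlow model to IR with the Model Optimizer.
# The paths and filenames below are placeholders -- adjust to your installation.
python /opt/intel/openvino/deployment_tools/model_optimizer/mo.py \
    --input_model frozen_inference_graph.pb \
    --data_type FP16 \
    --output_dir ./ir_model
# Produces frozen_inference_graph.xml (topology) and frozen_inference_graph.bin (weights).
```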
Obtaining and converting different models into IR format using the OpenVINO toolkit:
The OpenVINO toolkit (Open Visual Inference and Neural network Optimization toolkit) is an open-source deep learning toolkit, originally developed by the OpenCV team, that includes tools to convert deep learning models into IR format via its Model Optimizer. When converting models built with different frameworks, the Model Optimizer essentially works as a translator: it maps frequently used deep learning operations (for TensorFlow, Conv2D, Conv3D, Dropout, Dense, BatchNormalization, etc.; for Caffe, convolution, dropout_layer, etc.) to their equivalent representations in the OpenVINO toolkit and tunes them with the associated weights and biases from the trained model.

The Intel Distribution of OpenVINO toolkit also offers quite a large collection of pre-trained models, available on its website, which you can deploy to different devices. These pre-trained models can be downloaded directly with the Model Downloader tool, and they already come in the Intermediate Representation format at different precision levels. The precision level refers to the numeric precision of the model's saved weights and biases: FP32 (32-bit floating point), FP16 (16-bit floating point), INT16 (16-bit integer), INT8 (8-bit integer, available only for the pre-trained models), and more. Precision levels matter for ease of deployment on different platforms: lower precision yields slightly less accurate results, but the model requires far fewer resources to run, making it suitable for deployment on edge devices without substantially hampering the performance of either the device or the model.

Let us take a look at how we can use the Model Downloader to download pre-trained models from the Intel OpenVINO toolkit's website, and how to use them to run inference on a given input.
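To make the precision trade-off concrete, the short NumPy sketch below casts a hypothetical FP32 weight tensor (standing in for a real model's parameters) to FP16: storage is halved while the round-trip error stays small for typical weight magnitudes.

```python
import numpy as np

# Hypothetical weight tensor standing in for a real model's parameters.
rng = np.random.default_rng(0)
weights_fp32 = rng.standard_normal(1000).astype(np.float32)

# Casting to FP16 halves the storage at the cost of some precision.
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes)  # 4000 bytes
print(weights_fp16.nbytes)  # 2000 bytes

# The round-trip error is small for weights in a typical range.
max_err = np.abs(weights_fp32 - weights_fp16.astype(np.float32)).max()
print(max_err)
```

INT8 quantization goes further still, but generally needs calibration data, which is one reason it is offered only for the pre-trained models.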
The following is the link to the pre-trained model, whose documentation describes how to preprocess inputs before feeding them to the model.
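The preprocessing usually amounts to resizing the frame to the model's input resolution and rearranging it into NCHW layout. The pure-NumPy sketch below assumes a 72x72 BGR input, which is what the vehicle-attributes model's documentation describes; treat the exact shape as an assumption and check the docs for your model.

```python
import numpy as np

def preprocess(frame, height=72, width=72):
    """Resize an HWC BGR frame with nearest-neighbour sampling and
    reshape it to the NCHW layout the model expects. The 72x72 input
    size is an assumption taken from the model's documentation."""
    h, w = frame.shape[:2]
    rows = np.arange(height) * h // height
    cols = np.arange(width) * w // width
    resized = frame[rows][:, cols]               # nearest-neighbour resize
    chw = resized.transpose(2, 0, 1)             # HWC -> CHW
    return chw[np.newaxis].astype(np.float32)    # add batch dim -> NCHW

frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)  # dummy BGR frame
blob = preprocess(frame)
print(blob.shape)  # (1, 3, 72, 72)
```

In a real pipeline you would typically use cv2.resize instead of the hand-rolled nearest-neighbour sampling shown here.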
Provided that the OpenVINO toolkit is installed and properly configured on your local machine, let's jump right into the procedure for downloading the above model. Head over to your OpenVINO installation directory and open a Terminal or Command Prompt with administrator privileges. Then, to download the model, issue the following command:
python C:/<OPENVINO_INSTALLATION_DIRECTORY>/openvino/deployment_tools/tools/model_downloader/downloader.py --name vehicle-attributes-recognition-barrier-0039 --progress_format=json --precisions FP16,INT8 -o \Users\<USER_ID>\Desktop
The above command runs the downloader.py Python program, which parses the following command-line arguments:
- --name: the model name (if --all is given in place of --name, all available pre-trained models are downloaded),
- --precisions: the desired precision levels (if omitted, every available precision level of the model is downloaded),
- --progress_format=json: emits the progress report in JSON format, so it can be analysed by another program.
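With --progress_format=json, each progress event arrives as one JSON object per line, which a wrapper script can parse with the standard library. The field names in the sketch below are hypothetical, chosen only to show the parsing pattern; the actual event schema is described in the Model Downloader's documentation.

```python
import json

# Hypothetical progress lines in the one-JSON-object-per-line style
# that --progress_format=json produces; field names are illustrative.
progress_lines = [
    '{"event": "download_begin", "model": "vehicle-attributes-recognition-barrier-0039"}',
    '{"event": "download_end", "model": "vehicle-attributes-recognition-barrier-0039", "successful": true}',
]

for line in progress_lines:
    event = json.loads(line)          # each line is a standalone JSON object
    print(event["event"], event.get("successful"))
```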