Introduction to OpenVINO toolkit for deploying AI models on the edge (part 1)

Source: Deep Learning on Medium

The need for AI on the edge

Traditionally, data generated at the edge is sent to the cloud for processing, and the resulting insights are sent back to the edge.

There is a need to deploy AI at the edge for a number of reasons, such as:

1. The need for low latency

2. No network available to send the data, or concerns about network impact

3. The need for real-time decision making

4. The need for more security for the data

OpenVINO toolkit

The Open Visual Inference and Neural Network Optimization (OpenVINO) toolkit is an open-source project developed by Intel. It takes models developed and trained with deep learning frameworks such as PyTorch, TensorFlow, MXNet, and Kaldi and optimizes them for inference. The optimized model is then used with the Inference Engine, which further optimizes it for the target deployment hardware.
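As a rough sketch of this conversion step, a frozen TensorFlow model might be passed through the Model Optimizer like this (the file names are placeholders, and the script path assumes a default Linux installation of the 2019/2020-era toolkit):

```shell
# Convert a frozen TensorFlow graph into OpenVINO's Intermediate Representation.
# mo.py ships with the toolkit under deployment_tools/model_optimizer.
python3 /opt/intel/openvino/deployment_tools/model_optimizer/mo.py \
    --input_model frozen_model.pb \
    --output_dir ir_models/
# Produces frozen_model.xml (network topology) and frozen_model.bin (weights).
```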

The toolkit optimizes AI models for various hardware, such as Intel CPUs, GPUs, FPGAs (Field Programmable Gate Arrays), and VPUs (Vision Processing Units).

In addition to the model optimizer and the inference engine, OpenVINO has a model zoo that contains a variety of pre-trained models.

Model zoo

Categories:

1- Face and human-related detection and recognition

2- Vehicle-related detection and recognition

3- Text detection

4- Other categories

There are two main sets in the model zoo:

  • Public sets, which are pre-trained models that have not yet been run through the Model Optimizer, so you can fine-tune and retrain them for your specific use case before converting them.
  • Free sets, which are models already converted to the Intermediate Representation and ready to be used directly by the Inference Engine. Unlike the public sets, these models cannot be retrained.

Information about all models is available at https://docs.openvinotoolkit.org/latest/_models_intel_index.html

Models can be downloaded manually or with the Model Downloader script included in the toolkit, located at
(<OPENVINO_INSTALL_DIR>/deployment_tools/open_model_zoo/tools/downloader)
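For example, assuming a default Linux installation, a pre-trained model could be fetched like this (the model name below is only an illustration of the naming scheme, not a recommendation):

```shell
cd /opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader

# List every model the downloader knows about.
python3 downloader.py --print_all

# Download one pre-trained model into a local directory.
python3 downloader.py --name face-detection-adas-0001 -o ~/models/
```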

After a model is optimized, it is represented as an Intermediate Representation (IR): an .xml file describing the network topology and a .bin file holding the weights and biases. The IR is fed to the Inference Engine, which applies further hardware-specific optimizations, and is then integrated into the edge application.
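A minimal sketch of loading an IR with the Inference Engine's Python API might look like the following. The file paths are placeholders, and the API shown is the pre-2022 `openvino.inference_engine` interface that matches the toolkit version discussed here:

```python
import numpy as np
from openvino.inference_engine import IECore  # ships with the OpenVINO toolkit

ie = IECore()

# Read the Intermediate Representation produced by the Model Optimizer.
net = ie.read_network(model="model.xml", weights="model.bin")

# Load the network onto the target device ("CPU", "GPU", "MYRIAD", ...).
exec_net = ie.load_network(network=net, device_name="CPU")

# Run inference on a dummy input shaped like the network's input blob.
input_blob = next(iter(net.input_info))
shape = net.input_info[input_blob].input_data.shape
result = exec_net.infer({input_blob: np.zeros(shape, dtype=np.float32)})
```

Swapping `device_name` is all it takes to retarget the same IR at a GPU, FPGA, or VPU, which is the portability the toolkit is designed around.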