Introduction to Intel OpenVINO in 5 minutes

Source: Deep Learning on Medium

Intel OpenVINO (Open Visual Inference and Neural Network Optimisation) is a collection of tools, built around an inference engine, that enables developers to deploy computer-vision deep learning models at the edge. It also includes optimised OpenCV and OpenVX builds.

OpenVINO Architecture and Workflow

How it works

OpenVINO optimises already-trained models and converts them into an Intermediate Representation (IR). This IR can then be executed on lower-end processors or accelerators at the edge.

Edge refers to local or near-local computation, meaning that you don’t send your data to the cloud.

Model Optimisation and Conversion

All the popular model saving formats are supported:

  • .caffemodel (Caffe)
  • .pb (TensorFlow)
  • .params (MXNet)
  • .nnet (Kaldi)
  • .onnx (PyTorch and Apple Core ML)

Converting a pre-trained model is pretty straightforward: install the prerequisites for the frameworks you want to convert from, then run a single shell command.

The most straightforward format to convert is ONNX, which requires no extra parameters. The most involved are TensorFlow models, which need some extra CLI parameters to freeze the layer weights and to strip information that is only needed during training.
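As a rough sketch of what those commands look like (the script location, model file names, and output directory below are placeholders; on a default install the Model Optimizer script `mo.py` lives under the toolkit's `deployment_tools/model_optimizer` directory):

```shell
# ONNX: no framework-specific flags needed.
python mo.py --input_model model.onnx --output_dir ir/

# Frozen TensorFlow graph: extra flags are typically required,
# e.g. the input shape, which often cannot be inferred from the graph.
python mo.py --input_model frozen_graph.pb \
             --input_shape [1,224,224,3] \
             --output_dir ir/
```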

The converter supports several optimisation options, including conversion to reduced floating-point or integer precision, fusing of consecutive layer operations, flipping input channels between BGR and RGB, and more.
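For illustration, the BGR-to-RGB flip is simply a reversal of the channel axis. A minimal NumPy sketch of the operation that the converter can bake into the model for you (the function name is mine, not part of the toolkit):

```python
import numpy as np

def bgr_to_rgb(image: np.ndarray) -> np.ndarray:
    """Reverse the channel order of an HxWxC image (works both ways)."""
    return image[..., ::-1]

# A tiny 1x2 "image": a pure-blue and a pure-red pixel in BGR order.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)  # now [0, 0, 255] (red last) and [255, 0, 0]
```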

The command outputs an .xml file describing the model topology and a .bin file containing the weights; together, these two files form the model IR.

The model zoo offers a lot of pre-trained models you can use without any conversion. You can download them during SDK installation, from the website, or with a command-line downloader tool. The categories include, but are not limited to:

  • Detection
  • Segmentation
  • Recognition
  • Pose Estimation
  • OCR
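The command-line route might look like the following sketch; the tool ships with the toolkit's model-zoo utilities, and the model name here is just an example of the zoo's naming scheme:

```shell
# List every model available in the zoo.
python downloader.py --print_all

# Fetch one pre-trained model by name into a local directory.
python downloader.py --name face-detection-adas-0001 --output_dir models/
```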

The model converter has some model-architecture limitations. Here is the full list of supported layers. The supported layers also vary across specific hardware accelerators, but custom layers can tackle most of these problems. You might not need this part of the toolkit at all: the supported layer list is huge and should cover the vast majority of computer vision inference models.

The Inference Engine

The inference engine is the part that performs the efficient forward pass of the model. It can be instructed to run on the CPU or on accelerators, and can load additional instruction extensions from an .so file.

It supports most Intel devices, including but not limited to:

  • CPUs
  • GPUs
  • FPGAs
  • VPUs (like the Neural Compute Stick)

There may be some limitations regarding numerical precision: not every device supports every precision (VPUs, for example, run FP16 models).

In general, the inference engine is built in C++ for maximum performance, but it also natively supports a Python frontend for maximum productivity.

Set up the IECore Python wrapper and plug an IENetwork into it, choosing the target device and any extra instruction extensions if necessary. The core also supports querying device capabilities.

Then, send synchronous or asynchronous requests to the engine and process the output, much as you would when invoking the model with any other popular library, like TensorFlow or PyTorch.
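Putting those two steps together, a minimal sketch of the Python frontend (2020-era API). The IR file paths are placeholders, a dummy zero tensor stands in for a real input frame, and the code assumes the OpenVINO runtime is installed:

```python
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()  # the core wrapper; can also query device capabilities

# Plug in a network read from the IR pair produced by the converter.
net = ie.read_network(model="model.xml", weights="model.bin")
input_name = next(iter(net.input_info))
exec_net = ie.load_network(network=net, device_name="CPU")

# Dummy input matching the network's expected shape.
frame = np.zeros(net.input_info[input_name].input_data.shape,
                 dtype=np.float32)

# Synchronous request: blocks until the forward pass finishes.
result = exec_net.infer(inputs={input_name: frame})

# Asynchronous request: start inference, do other work, then wait.
exec_net.start_async(request_id=0, inputs={input_name: frame})
if exec_net.requests[0].wait(-1) == 0:  # 0 means OK
    result = exec_net.requests[0].output_blobs
```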

Deploying an Edge App using the toolkit

Be sure to take a look at the samples, and use a pre-trained model from the zoo to avoid the hassle of converting a model to the IR format. Leverage OpenCV goodies as much as you can, and enjoy the process of creating something new!