Quick intro to Intel’s OpenVINO toolkit for faster Deep Learning inference

Original article was published on Deep Learning on Medium

What is OpenVINO

Intel’s Open Visual Inference and Neural network Optimization (OpenVINO) toolkit enables vision applications (or any other DNNs) to run faster on Intel hardware and processors.

The OpenVINO™ toolkit is a comprehensive toolkit for quickly developing applications and solutions that emulate human vision. Based on Convolutional Neural Networks (CNNs), the toolkit extends CV workloads across Intel® hardware, maximizing performance.

To put it straight, it is an attempt by Intel to sell their underlying processors (CPUs, integrated GPUs, VPUs, Gaussian & Neural Accelerators, and FPGAs) by making your AI (vision) applications run faster on them.

This is not a toolkit for faster training of your Deep Learning models, but for faster inference with your already trained deep neural network.

The two main components of the OpenVINO toolkit are

  1. Model Optimizer
  2. Inference Engine

For complete details on the toolkit, check this.

Intel OpenVINO, Source

The task of the Model Optimizer (a .py script for each supported framework) is to take an already trained DL model and adjust it for optimal execution on the target device. The output of the Model Optimizer is the Intermediate Representation (IR) of your DL model.

The IR is a set of two files that describe the optimized version of your DL model:

  1. *.xml – Describes the network topology
  2. *.bin – Contains the weights (and biases) as binary data
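As a sketch, converting an already trained model to IR is a single command-line call to the Model Optimizer script. The model name, paths, and shape below are illustrative assumptions, not values from the article:

```shell
# Hypothetical example: convert a frozen TensorFlow model to IR.
# mo.py ships with the OpenVINO toolkit; all paths here are illustrative.
python mo.py \
    --input_model frozen_model.pb \
    --input_shape "[1,224,224,3]" \
    --data_type FP16 \
    --output_dir ir_model/
# Produces ir_model/frozen_model.xml (topology)
# and ir_model/frozen_model.bin (weights & biases)
```

`--data_type FP16` asks the Model Optimizer to store the weights in half precision, which is a common choice when targeting devices such as VPUs.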

The output of the Model Optimizer (the IR files) is what we pass to the Inference Engine, which runs it on the hardware.

So, to make the best use of your Intel hardware for ML applications running in production (at the edge, in the cloud, ...), all you have to do is generate the IR files for your DL model and run them through the Inference Engine, rather than running the original model directly on the hardware.