Let’s see why AI enthusiasts fall in love with OpenCV!
Over the last couple of years, deep learning has been one of the fastest growing areas within artificial intelligence. It has achieved remarkable results, particularly in computer vision, where it powers self-driving cars and lets computers recognize objects with near human-like ability. OpenCV is very highly rated because it includes state-of-the-art computer vision and machine learning algorithms. When deep learning is deployed on machines and IoT devices, what you run there are pre-trained deep learning models. This seems like a marriage made in heaven: world-class computer vision software and the ability to run deep learning models, all on inexpensive hardware.
OpenCV is an open source computer vision and machine learning software library. It’s probably the most popular computer vision software out there. The library has more than 2,500 optimized algorithms, including both classic and state-of-the-art computer vision and machine learning algorithms. These algorithms can be used to detect and recognize faces, identify objects, classify human actions in videos, and track camera movements and moving objects, amongst many other things. OpenCV is written natively in C++. You can also use a Python wrapper for OpenCV. OpenCV also has interfaces to Java and MATLAB and is supported on Windows, Linux, Android, and macOS.
Magical deep learning for OpenCV
OpenCV’s deep learning module is known as DNN. It’s important to understand that the DNN module is not a full-fledged deep learning framework. We cannot train a deep learning network with it. There is no backpropagation, and so no learning takes place. What we can do is take some input data, pass it through a previously trained deep neural network model, and output the result. This is known as inference. In deep learning terminology, this means that only a forward pass takes place.
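To make "only a forward pass" concrete, here is a minimal sketch in plain Python. It is not OpenCV's implementation; the tiny two-layer network and its weights (W1, B1, W2, B2) are made-up stand-ins for a pre-trained model, and the helper names are purely illustrative.

```python
import math

def dense(x, weights, bias):
    # Fully connected layer: y_j = sum_i x_i * weights[i][j] + bias[j]
    return [sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
            for j in range(len(bias))]

def relu(values):
    # ReLU activation: negative values are clipped to zero
    return [max(0.0, v) for v in values]

def softmax(values):
    # Numerically stable softmax: subtract the max before exponentiating
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical "pre-trained" parameters; in practice these come from a model file.
W1 = [[1.0, -1.0], [0.5, 2.0]]
B1 = [0.0, 0.0]
W2 = [[1.0, 0.0], [0.0, 1.0]]
B2 = [0.0, 0.0]

def infer(x):
    # Inference: data flows forward through fixed weights.
    # There is no loss, no gradient, and no weight update anywhere.
    hidden = relu(dense(x, W1, B1))
    return softmax(dense(hidden, W2, B2))

probs = infer([1.0, 2.0])
predicted_class = probs.index(max(probs))
print(predicted_class, probs)
```

Notice that nothing in the code ever changes the weights. That is the whole difference between an inference engine and a training framework.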
Now if you’ve only got a forward pass, the code becomes simpler. Setting up and assembling the deep learning network is quicker, and inference is fast enough on CPUs. OpenCV’s DNN module supports models from Caffe, TensorFlow, Torch, and Darknet, as well as models in ONNX format. As OpenCV’s deep neural network implementation is not tied to one framework, you don’t have the limitations of that framework.
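As a rough illustration of what "not tied to one framework" looks like in practice, the sketch below maps a model file's extension to the matching loader in the cv2.dnn namespace. The mapping table and the pick_reader helper are my own illustration, and the file extensions are the ones these frameworks commonly use, which is an assumption. The loader names themselves are real functions in cv2.dnn. The sketch deliberately does not import OpenCV, so it runs anywhere.

```python
import os

# Common file extension -> (framework, name of the cv2.dnn loader function).
# The table is illustrative; only the loader names come from OpenCV.
READERS = {
    ".caffemodel": ("Caffe", "readNetFromCaffe"),
    ".pb": ("TensorFlow", "readNetFromTensorflow"),
    ".t7": ("Torch", "readNetFromTorch"),
    ".weights": ("Darknet", "readNetFromDarknet"),
    ".onnx": ("ONNX", "readNetFromONNX"),
}

def pick_reader(model_path):
    # Hypothetical helper: choose a loader by looking at the file extension.
    ext = os.path.splitext(model_path)[1].lower()
    if ext not in READERS:
        raise ValueError("unrecognized model file extension: " + repr(ext))
    return READERS[ext]

framework, reader_name = pick_reader("resnet50.onnx")
print(framework, reader_name)
```

With OpenCV installed, the ONNX case would then come down to a call like cv2.dnn.readNetFromONNX("resnet50.onnx"), where the file name is just a placeholder. Whichever loader is used, you get back the same kind of network object, so the rest of your code does not care which framework the model came from.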
The other advantage is that, because the models are held in an internal representation, the OpenCV developers have ways to optimize and speed up the code. With OpenCV having its own deep learning implementation, external dependencies are reduced to a minimum. A simple inference engine will just pass the input data through the network and output the result. However, there are a lot of optimizations that can make inference faster. For example, an efficient inference engine could prune parts of the neural network that are never activated, or combine multiple layers into a single computational step. If the hardware supports 16-bit floating-point operations, which are often about twice as fast as their 32-bit equivalents, an inference engine can use them to speed up processing with little or no loss in accuracy. Now in the world of IoT and edge devices, most inference is done on CPUs. You won’t put a GPU that costs a few hundred dollars into your surveillance camera. This is what makes OpenCV’s deep learning module a good fit. You just run a deep learning model of your choice through it as an inference engine.
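Combining multiple layers into a single computational step can be sketched with a toy example: a linear layer followed by a per-output scale and shift (which is what batch normalization reduces to at inference time) can be folded into one linear layer with adjusted weights. This is a generic illustration of layer fusion in plain Python, not OpenCV's actual code, and all the numbers are made up.

```python
def linear(x, W, b):
    # y_j = sum_i x_i * W[i][j] + b[j]
    return [sum(xi * row[j] for xi, row in zip(x, W)) + b[j]
            for j in range(len(b))]

def scale_shift(y, gamma, beta):
    # Per-output affine transform: z_j = gamma[j] * y_j + beta[j]
    return [g * v + s for v, g, s in zip(y, gamma, beta)]

def fuse(W, b, gamma, beta):
    # Fold the scale and shift into the linear layer's parameters:
    # gamma_j * (sum_i x_i W_ij + b_j) + beta_j
    #   = sum_i x_i (W_ij * gamma_j) + (gamma_j * b_j + beta_j)
    W_fused = [[w * gamma[j] for j, w in enumerate(row)] for row in W]
    b_fused = [gamma[j] * b[j] + beta[j] for j in range(len(b))]
    return W_fused, b_fused

# Made-up parameters for illustration only
W = [[1.0, 2.0], [3.0, 4.0]]
b = [0.5, -0.5]
gamma = [2.0, 0.5]
beta = [1.0, 0.0]
x = [1.0, -1.0]

two_steps = scale_shift(linear(x, W, b), gamma, beta)  # two layers
W_f, b_f = fuse(W, b, gamma, beta)                     # done once, at load time
one_step = linear(x, W_f, b_f)                         # one layer per input
print(two_steps, one_step)
```

The fusion is paid for once when the model is loaded, and every input after that goes through one layer instead of two while producing the same output.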
Intel have invested heavily in this and released the OpenVINO toolkit. OpenVINO, or Open Visual Inference and Neural network Optimization, is designed to speed up neural networks for tasks like image classification and object detection.
So what’s happening under the hood?
When the models are loaded, they are converted into an internal representation in OpenCV that’s very similar to Caffe. If we head over to the OpenCV website, we can see that several basic neural network layers are supported. You’ve got convolution and deconvolution. You’ve got the pooling layers. You’ve got activation functions, such as Tanh, ReLU, Sigmoid, and Softmax, and functions such as Reshape, Flatten, Slice, and Split. In OpenCV’s Deep Learning wiki, you can see that there’s also support for well-known neural network architectures, such as AlexNet, GoogLeNet, VGG, and ResNet, amongst others. The DNN module has models available for image classification, object detection, and semantic segmentation, amongst others.
Now if each model is being translated to an internal representation, how can we be sure that something hasn’t been lost in translation? OpenCV has published several test results indicating that there’s no difference in accuracy between, say, running ResNet-50 through the DNN module and running it in its original framework. This means that you’ll get the same results regardless of whether you use OpenCV’s DNN module or the original framework.
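If you want to check this for your own model, one simple approach is to run the same inputs through both implementations and compare the outputs. The sketch below is only an assumption about how such a check could look. The scores are made-up placeholders, not OpenCV's published results, and compare_outputs is a hypothetical helper.

```python
def top1(scores):
    # Index of the highest score, i.e. the predicted class
    return max(range(len(scores)), key=scores.__getitem__)

def compare_outputs(reference, candidate):
    # reference, candidate: one list of class scores per input image
    max_abs_diff = max(abs(r - c)
                       for ref_row, cand_row in zip(reference, candidate)
                       for r, c in zip(ref_row, cand_row))
    matches = sum(top1(r) == top1(c) for r, c in zip(reference, candidate))
    return max_abs_diff, matches / len(reference)

# Placeholder scores for two images and three classes (not real measurements)
original_framework = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
opencv_dnn = [[0.1001, 0.6998, 0.2001], [0.5999, 0.3, 0.1001]]

max_abs_diff, agreement = compare_outputs(original_framework, opencv_dnn)
print(max_abs_diff, agreement)
```

Tiny numerical differences in the raw scores are normal between implementations. What matters for accuracy is whether the predicted classes agree.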