Image Classification & Machine Learning Tutorial | Qt & TensorFlow

Original article was published on Artificial Intelligence on Medium


Artificial intelligence and smart applications are steadily becoming more popular. Companies strongly rely on AI systems and machine learning to make faster and more accurate decisions based on their data.

This guide provides an example for Image Classification and Object Detection built with Google’s TensorFlow Framework.

By reading this post, you will learn how to:

  • Build TensorFlow for Android, iOS and Desktop Linux.
  • Integrate TensorFlow in your Qt-based Felgo project.
  • Use the TensorFlow API to run Image Classification and Object Detection models.

Why Add Artificial Intelligence to Your Mobile App

As of 2017, a quarter of organizations already invest more than 15 percent of their IT budget in machine learning. With over 75 percent of businesses spending money and effort in Big Data, machine learning is set to become even more important in the future.

Real-World Examples of Machine Learning

Artificial intelligence is on its way to becoming a business-critical technology, with the goal of improving decision-making with a far more data-driven approach. Regardless of the industry, machine learning helps to make computing processes more efficient, cost-effective, and reliable. For example, it is used for:

  • Financial Services: To track customer and client satisfaction, react to market trends or calculate risks. E.g. PayPal uses machine learning to detect and combat fraud.
  • Healthcare: For personalized health monitoring systems, to enable healthcare professionals to spot potential anomalies early on. Have a look at the latest examples of AI in healthcare.
  • Retail: Offer personalized recommendations based on your previous purchases or activity. For example, recommendations on Netflix or Spotify.
  • Voice Recognition Systems, like Siri or Cortana.
  • Face Recognition Systems, like DeepFace by Facebook.
  • Spam Email Detection and Filtering.

Image Classification and Object Detection Example

TensorFlow is Google’s open-source machine learning framework. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs) and architectures (desktops, clusters of servers, mobile, and edge devices). It supports Linux, macOS, Windows, Android, and iOS among others.

About TensorFlow

TensorFlow comes in different flavors. The main one is the full TensorFlow framework. Another one is TensorFlow Lite, TensorFlow’s lightweight solution for mobile and embedded devices. However, TensorFlow Lite is currently in a technology-preview state: not all TensorFlow features are supported yet, although it is set to become the reference for mobile and embedded devices in the near future.

There is plenty of online material about how to build applications with Tensorflow. To begin with, we highly recommend the free ebook Building Mobile Applications with TensorFlow by Pete Warden, lead of the TensorFlow mobile/embedded team.

The example in this guide uses the original TensorFlow flavor. It shows how to integrate TensorFlow with Qt and Felgo to create a simple multiplatform app that includes two pre-trained neural networks, one for image classification and another one for object detection. The code of this example is hosted on GitHub.

Clone the Repository

Clone the repository recursively, since the TensorFlow repository is included inside it as a submodule. The TensorFlow version included is 1.8.

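A recursive clone looks like this (the placeholder stands in for the example’s GitHub URL, which you can find on the project page):

```shell
# Clone the example together with its TensorFlow 1.8 submodule.
# <repository-url> is a placeholder for the example's GitHub URL.
git clone --recursive <repository-url>

# If you already cloned without --recursive, fetch the submodule afterwards:
git submodule update --init --recursive
```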

Many thanks to the project developers for sharing this example and preparing this guide:

  • Javier Bonilla, Ph.D. in Computer Science, doing research on modeling, optimization and automatic control of concentrating solar thermal facilities and power plants at CIEMAT — Plataforma Solar de Almería (PSA), one of the largest concentrating solar technology research, development and test centers in Europe.
  • Jose Antonio Carballo, Mechanical Engineer and Ph.D. student from University of Almería, working on his doctoral thesis on modeling, optimization and automatic control for the efficient use of water and energy resources in concentrating solar thermal facilities and power plants at CIEMAT — Plataforma Solar de Almería (PSA).

Advantages of using Felgo and Qt with TensorFlow

Felgo and Qt are wonderful tools for multiplatform applications. Qt has a rich set of ready-to-use multiplatform components for diverse areas such as multimedia, network and connectivity, graphics, input methods, sensors, data storage and more. Felgo further contributes to ease the deployment to mobile and embedded devices and adds nice features such as resolution and aspect ratio independence and additional components and controls. Felgo also provides easier access to native features, as well as plugins for monetization, analytics, cloud services and much more.

One nice feature of Felgo is that it is not restricted to mobile devices, so you can test and prototype your app on your development computer, which is certainly faster than compiling and deploying your app to emulators. You can even use Felgo live reloading to see changes in code almost instantaneously. Live reloading is also supported on Android and iOS devices, which is perfect for fine-tuning changes or testing code snippets on mobile devices.

So TensorFlow provides the machine learning framework, whereas Felgo and Qt facilitate the app deployment to multiple platforms: desktop and mobile.

Get Qt training and consulting services if you need help with that.

How to Build TensorFlow for Qt

We need to build TensorFlow for each platform and architecture. The recommended way is to use the Bazel build system. However, in this example we will explore how to use make to build TensorFlow for Linux, Android and iOS. Check that you have installed all the required libraries and tools listed in the TensorFlow Makefile readme.

If you are interested in building Tensorflow for macOS, check the Supported Systems section on the Makefile readme. For Windows, check TensorFlow CMake build.

If you have issues during the compilation process have a look at open Tensorflow issues or post your problem there to get help.

Once you have built TensorFlow, your app can link against these three libraries: libtensorflow-core.a, libprotobuf.a and libnsync.a.

Note: When you build for different platforms and architectures in the same TensorFlow source folder, TensorFlow may delete previously compiled libraries, so make sure you back them up. These are the paths where you can find those libraries, with MAKEFILE_DIR=./tensorflow/tensorflow/contrib/makefile:

  • Linux
    – libtensorflow-core: $(MAKEFILE_DIR)/gen/lib
    – libprotobuf: $(MAKEFILE_DIR)/gen/protobuf/lib64
    – libnsync: $(MAKEFILE_DIR)/downloads/nsync/builds/default.linux.c++11/
  • Android ARM v7
    – libtensorflow-core: $(MAKEFILE_DIR)/gen/lib/android_armeabi-v7a
    – libprotobuf: $(MAKEFILE_DIR)/gen/protobuf_android/armeabi-v7a/lib/
    – libnsync: $(MAKEFILE_DIR)/downloads/nsync/builds/armeabi-v7a.android.c++11/
  • Android x86
    – libtensorflow-core: $(MAKEFILE_DIR)/gen/lib/android_x86
    – libprotobuf: $(MAKEFILE_DIR)/gen/protobuf_android/x86/lib/
    – libnsync: $(MAKEFILE_DIR)/downloads/nsync/builds/x86.android.c++11/
  • iOS
    – libtensorflow-core: $(MAKEFILE_DIR)/gen/lib
    – libprotobuf: $(MAKEFILE_DIR)/gen/protobuf_ios/lib/
    – libnsync: $(MAKEFILE_DIR)/downloads/nsync/builds/arm64.ios.c++11/

The shell commands in the following sections only work if executed inside the main Tensorflow folder.

Building for Linux

For Linux, we just need to run the build_all_linux.sh script shipped with the TensorFlow Makefile build.

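The Linux build boils down to one helper script (a sketch, assuming the TensorFlow 1.8 checkout described above):

```shell
# Run from the root of the TensorFlow checkout.
# Downloads dependencies, then builds libtensorflow-core.a,
# libprotobuf.a and libnsync.a for the host Linux system.
tensorflow/contrib/makefile/build_all_linux.sh
```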

If you are compiling on a 64-bit system, you might run into a linker error caused by protobuf installing its libraries under lib64 instead of lib.


In this case, change the $(MAKEFILE_DIR)/gen/protobuf-host/lib references to $(MAKEFILE_DIR)/gen/protobuf-host/lib64 in the tensorflow/tensorflow/contrib/makefile/Makefile file.

With some GCC 8 compiler versions, the nsync build can fail with a -Werror=class-memaccess error.


To avoid it, include the -Wno-error=class-memaccess flag in the PLATFORM_CFLAGS variable for Linux (case "$target_platform" in linux) in the tensorflow/tensorflow/contrib/makefile/compile_nsync.sh file.

Building for Android (on Linux)

First, you need to set the NDK_ROOT environment variable to point to your NDK root path. You can download the NDK from this link. Second, you need to compile the cpufeatures library shipped with the NDK. This example was tested with Android NDK r14e.

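A minimal environment setup might look like this (the NDK install path is a placeholder; the cpufeatures sources ship inside the NDK):

```shell
# Path is an example; point it at your actual NDK r14e install.
export NDK_ROOT=/path/to/android-ndk-r14e

# The cpufeatures library sources live inside the NDK:
ls $NDK_ROOT/sources/android/cpufeatures
```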

Then, run the build_all_android.sh script to compile TensorFlow for the ARM v7 architecture.

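In TensorFlow 1.8 the Android build is driven by build_all_android.sh, whose default target is armeabi-v7a (run from the TensorFlow root with NDK_ROOT set):

```shell
# ARM v7 (armeabi-v7a) is the script's default architecture.
tensorflow/contrib/makefile/build_all_android.sh
```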

You may also want to compile for x86 platforms, for instance to debug in an Android emulator. To do so, execute the same script with the architecture parameter set accordingly.

Note: If you face issues compiling for Android x86 with Android NDK r14, use Android NDK r10e and set NDK_ROOT accordingly to its path.

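Assuming the -a flag selects the target architecture, as in the TensorFlow 1.8 version of build_all_android.sh:

```shell
# Build for Android x86, e.g. to debug in the emulator.
tensorflow/contrib/makefile/build_all_android.sh -a x86
```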

The Android architectures supported by the TensorFlow Makefile build include armeabi, armeabi-v7a, arm64-v8a, x86 and x86_64.


Building for iOS (on macOS)

On macOS, TensorFlow ships a build_all_ios.sh script to build for iOS.

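A sketch of the iOS build, assuming Xcode and its command line tools are installed:

```shell
# Run from the root of the TensorFlow checkout on macOS.
# Builds the libraries for the supported iOS architectures.
tensorflow/contrib/makefile/build_all_ios.sh
```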

You may get a __thread-related error while building TensorFlow for iOS.


You can avoid it by performing the changes given in this comment, that is, changing -D__thread=thread_local \ to -D__thread= \ in the Makefile (for the i386 architecture only).

How to Use TensorFlow in Your Qt Mobile App

The source code of the app is in a GitHub repository. This section walks through the app code.

Link TensorFlow in Your Project

Our qmake project file adds the TensorFlow header include paths and links against the TensorFlow libraries depending on the target platform.

For Android, ANDROID_NDK_ROOT was set to the path of Android NDK r14e and ANDROID_NDK_PLATFORM was set to android-21 in Qt Creator (Project -> Build Environment).

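The exact snippet is in the project on GitHub; a simplified sketch of the idea looks like this (paths follow the library locations listed earlier, and the TF_MAKE_PATH variable is a name introduced here for illustration):

```qmake
# Hypothetical variable pointing at the TensorFlow Makefile output tree.
TF_MAKE_PATH = $$PWD/tensorflow/tensorflow/contrib/makefile

INCLUDEPATH += $$PWD/tensorflow \
               $$TF_MAKE_PATH/gen/proto \
               $$TF_MAKE_PATH/downloads/eigen

linux:!android {
    # --whole-archive keeps TensorFlow's statically registered ops.
    LIBS += -Wl,--whole-archive $$TF_MAKE_PATH/gen/lib/libtensorflow-core.a -Wl,--no-whole-archive
    LIBS += $$TF_MAKE_PATH/gen/protobuf/lib64/libprotobuf.a
    LIBS += $$TF_MAKE_PATH/downloads/nsync/builds/default.linux.c++11/libnsync.a
}

android {
    equals(ANDROID_TARGET_ARCH, armeabi-v7a) {
        LIBS += -Wl,--whole-archive $$TF_MAKE_PATH/gen/lib/android_armeabi-v7a/libtensorflow-core.a -Wl,--no-whole-archive
        LIBS += $$TF_MAKE_PATH/gen/protobuf_android/armeabi-v7a/lib/libprotobuf.a
        LIBS += $$TF_MAKE_PATH/downloads/nsync/builds/armeabi-v7a.android.c++11/libnsync.a
    }
}

ios {
    # -force_load is the macOS/iOS equivalent of --whole-archive.
    LIBS += -force_load $$TF_MAKE_PATH/gen/lib/libtensorflow-core.a
    LIBS += $$TF_MAKE_PATH/gen/protobuf_ios/lib/libprotobuf.a
    LIBS += $$TF_MAKE_PATH/downloads/nsync/builds/arm64.ios.c++11/libnsync.a
}
```

Linking libtensorflow-core.a with --whole-archive (or -force_load) is needed because TensorFlow registers its kernels via static initializers that the linker would otherwise discard.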

Create the GUI with QML

The GUI is pretty simple; there are only two pages.

  • Live video output page: The user can switch between the front and rear cameras.
  • Settings page: Page for setting the minimum confidence level and selecting the model: one for image classification and another one for object detection.

Main.qml

In main.qml, there is a Storage component to load/save the minimum confidence level, the selected model and whether the inference time is shown. The inference time is the time taken by the TensorFlow neural network model to process an image. The storage keys are kMinConfidence, kModel and kShowTime. Their default values are given by defMinConfidence, defModel and defShowTime. The actual values are stored in minConfidence, model and showTime.

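The full file is on GitHub; a sketch of the storage logic, using the property and key names from the text (the default values here are illustrative):

```qml
import Felgo 3.0
import QtQuick 2.0

App {
    // Illustrative defaults; the real values are in the GitHub project.
    readonly property real defMinConfidence: 0.5
    readonly property string defModel: "ImageClassification"
    readonly property bool defShowTime: false

    property real minConfidence: defMinConfidence
    property string model: defModel
    property bool showTime: defShowTime

    Storage {
        id: storage
        Component.onCompleted: {
            // Fall back to the defaults when no value was saved yet.
            var c = getValue("kMinConfidence")
            var m = getValue("kModel")
            var t = getValue("kShowTime")
            minConfidence = c !== undefined ? c : defMinConfidence
            model         = m !== undefined ? m : defModel
            showTime      = t !== undefined ? t : defShowTime
        }
    }
}
```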

There is a Navigation component with two NavigationItems; each one is a Page. The VideoPage shows the live video camera output. It reads the minConfidence, model and showTime properties. The AppSettingsPage also reads those properties and sets their new values in the onMinConfidenceChanged, onModelChanged and onShowTimeChanged handlers.

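A sketch of the navigation structure (the "app" id and the save calls are illustrative):

```qml
Navigation {
    NavigationItem {
        title: "Live"
        NavigationStack {
            VideoPage {
                minConfidence: app.minConfidence
                model: app.model
                showTime: app.showTime
            }
        }
    }
    NavigationItem {
        title: "Settings"
        NavigationStack {
            AppSettingsPage {
                // Persist new values whenever the user changes a setting.
                onMinConfidenceChanged: storage.setValue("kMinConfidence", minConfidence)
                onModelChanged: storage.setValue("kModel", model)
                onShowTimeChanged: storage.setValue("kShowTime", showTime)
            }
        }
    }
}
```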

VideoPage.qml

A screenshot of the VideoPage for object detection on iOS is shown below.

The QtMultimedia module is loaded on this page.


The VideoPage has the minConfidence, model and showTime properties. It also has another property to store the camera index, cameraIndex.

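In QML this amounts to a few property declarations (the types are assumptions based on how the values are used):

```qml
// Property types are assumptions based on how the values are used.
property real minConfidence
property string model
property bool showTime
property int cameraIndex: 0
```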

There is a Camera component that is started and stopped when the page is shown or hidden. It has two boolean properties: the first is true if there is at least one camera, and the second is true if there are at least two cameras.

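A sketch of this setup, with hypothetical names for the two boolean properties (QtMultimedia exposes the connected cameras via QtMultimedia.availableCameras):

```qml
// The names cameraAvailable/twoCamerasAvailable are illustrative.
readonly property bool cameraAvailable: QtMultimedia.availableCameras.length > 0
readonly property bool twoCamerasAvailable: QtMultimedia.availableCameras.length > 1

Camera {
    id: camera
}

// Start the camera when the page becomes visible, stop it otherwise.
onVisibleChanged: visible ? camera.start() : camera.stop()
```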

There is also a button in the navigation bar to switch cameras. This button is only visible when more than one camera is available. The initialRotation() function is required due to Qt bug 37955, which incorrectly rotates the front camera video output on iOS.


When no camera is detected, an icon and a message are shown to the user.


When the camera is loading, an icon with a cool animation and a message are also shown to the user.


The camera video output fills the whole page. It is only visible when at least one camera is detected and active. We define a filter, objectsRecognitionFilter, which is implemented in a C++ class. This filter gets each video frame, transforms it into input data for TensorFlow, invokes TensorFlow, and draws the results over the video frame. This C++ class is introduced later.

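A sketch of the VideoOutput with the filter attached (the filter type is registered from C++; the visibility condition and the cameraAvailable name are illustrative):

```qml
VideoOutput {
    anchors.fill: parent
    source: camera
    visible: cameraAvailable && camera.cameraStatus === Camera.ActiveStatus
    // QML video filters are applied in order to every frame.
    filters: [ objectsRecognitionFilter ]
}

ObjectsRecognizer {
    id: objectsRecognitionFilter
    // Settings such as minConfidence could be forwarded to the filter here.
}
```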

AppSettingsPage.qml

A screenshot of this page on iOS is shown below.

The AppSettingsPage allows the user to select the minimum confidence level for detections with a slider. The slider value is stored in minConfidence.

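With Felgo's AppSlider this might look like the following (styling omitted):

```qml
AppSlider {
    from: 0
    to: 1
    value: minConfidence
    // Update the property as the user drags the handle.
    onValueChanged: minConfidence = value
}
```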

The inference time, that is, the time TensorFlow takes to process an image, can also be shown on the screen. It can be enabled or disabled by means of a switch. The boolean value is stored in showTime.

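Using Felgo's AppSwitch, for example:

```qml
AppSwitch {
    checked: showTime
    onCheckedChanged: showTime = checked
}
```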

There are also two exclusive checkboxes to select the model: one for image classification and another for object detection. The selected model is stored in the `model` property. If the currently selected model is unchecked, the other model is automatically checked, as one of them should be always selected.

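A sketch of the mutually exclusive checkboxes (the model string values are illustrative):

```qml
AppCheckBox {
    id: classificationCheck
    text: "Image classification"
    checked: model === "ImageClassification"
    onCheckedChanged: {
        if (checked) model = "ImageClassification"
        // Keep at least one model selected at all times.
        else if (!detectionCheck.checked) detectionCheck.checked = true
    }
}
AppCheckBox {
    id: detectionCheck
    text: "Object detection"
    checked: model === "ObjectDetection"
    onCheckedChanged: {
        if (checked) model = "ObjectDetection"
        else if (!classificationCheck.checked) classificationCheck.checked = true
    }
}
```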

C++ TensorFlow Interface and Video Frame Filter

Two main tasks are programmed in C++.

  • Interfacing with TensorFlow
  • Managing video frames

The source code of the C++ classes is not presented here in detail; instead, the process is sketched and explained, and links to further details are given. Nevertheless, you can have a look at the source code hosted on GitHub.

Interfacing with Tensorflow

The Tensorflow C++ class interfaces with the TensorFlow library; check the code for a detailed description of this class. This class is a wrapper; check the TensorFlow C++ API documentation for further information.

Managing video frames

The workflow for managing video frames is shown in the next flow diagram.

An object filter, ObjectsRecognizer, is applied to the VideoOutput to process frames. This filter is implemented by means of two C++ classes, ObjectsRecogFilter and ObjectsRecogFilterRunable. For further information about how to apply filters, check introducing video filters in Qt Multimedia.

The filter is processed in the run method of the ObjectsRecogFilterRunable class. The general steps are the following.

  1. We need to convert our QVideoFrame to a QImage so we can manipulate it.
  2. We check if TensorFlow is running. Since TensorFlow is executed in another thread, we use the QMutex and QMutexLocker classes to check in a thread-safe way whether it is running. A nice example is given in the QMutexLocker class documentation.
    – If TensorFlow is running, nothing is done.
    – If TensorFlow is NOT running, we execute it in another thread by means of the C++ classes TensorflowThread and WorkerTF. Signals and slots are used to communicate between the main thread and these classes; check [QThreads general usage](https://wiki.qt.io/QThreads_general_usage) for further details. We provide the video frame image as input. When TensorFlow finishes, we store the results given by the selected model, also by means of signals and slots.
  3. We get the stored results (if any) and apply them to the current video frame image. If our model is image classification, we just draw the name and score of the top image class if the score is above the minimum confidence value. If our model is object detection, we iterate over all the detections and draw the bounding boxes, object names and confidence values if they are above the minimum confidence level. There is an auxiliary C++ class, AuxUtils, which provides functions to draw on frames, such as drawText and drawBoxes.
  4. The last step is to convert our QImage back to a QVideoFrame to be processed by our QML VideoOutput component, and then we go back to process a new video frame.
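The steps above can be sketched as follows. This is a simplified, illustrative sketch only: the helper names (convertFrameToImage, startTensorFlowThread, lastResults, the Detection type) and the drawBoxes signature are assumptions, not the project's real API, which is on GitHub.

```cpp
// Sketch of ObjectsRecogFilterRunable::run(); names of helpers are hypothetical.
QVideoFrame ObjectsRecogFilterRunable::run(QVideoFrame *input,
                                           const QVideoSurfaceFormat &surfaceFormat,
                                           RunFlags flags)
{
    Q_UNUSED(surfaceFormat)
    Q_UNUSED(flags)

    // 1. Convert the video frame to a QImage we can draw on.
    QImage image = convertFrameToImage(*input);

    // 2. Thread-safe check whether TensorFlow is already busy.
    bool busy;
    {
        QMutexLocker locker(&m_filter->mutex());
        busy = m_filter->isTensorFlowRunning();
    }
    if (!busy)
        m_filter->startTensorFlowThread(image); // inference in a worker thread

    // 3. Draw the latest stored results (if any) over the current frame.
    for (const Detection &d : m_filter->lastResults()) {
        if (d.confidence >= m_filter->minConfidence())
            AuxUtils::drawBoxes(image, d);
    }

    // 4. Convert back to a QVideoFrame for the QML VideoOutput.
    return QVideoFrame(image);
}
```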

Neural Network Models for Image Classification and Object Detection

We need neural network models to perform image classification and object detection tasks. Google provides a set of pre-trained models that do this. The file extension for TensorFlow frozen neural network models is .pb. The example on GitHub already includes MobileNet models: MobileNet V2 1.0_224 for image classification and SSD MobileNet V1 COCO for object detection. MobileNets are a class of efficient neural network models for mobile and embedded vision applications.

Image Classification Models

Image classification models can be downloaded from the TensorFlow-Slim image classification model library. Our example code is designed for MobileNet neural networks. For example, download mobilenet_v2_1.0_224.tgz, uncompress it, and copy the mobilenet_v2_1.0_224_frozen.pb file to our assets folder as image_classification.pb. The image size in this case, 224 x 224 pixels, is set in the constants fixed_width and fixed_height defined in our Tensorflow C++ class. The output layer, MobilenetV2/Predictions/Reshape_1 in this case, is also specified in the constant list variable listOutputsImgCla in the Tensorflow class. Labels for these models are already set in the image_classification_labels.txt file. Labels belong to ImageNet classes.

Object Detection Models

Check the TensorFlow detection model zoo for a comprehensive list of object detection models. Any SSD MobileNet model can be used. This kind of model provides caption, confidence and bounding box outputs for each detected object. For instance, download ssd_mobilenet_v1_coco_2018_01_28.tar.gz, uncompress it, and copy frozen_inference_graph.pb to our assets folder as object_detection.pb. Labels for this kind of model are already given by the object_detection_labels.txt file. Labels belong to COCO labels.

Known Issues

Although the presented example is functional, there is still room for improvement, particularly in the C++ code, where naive solutions were chosen for simplicity.

There are also some issues to address; the following list summarizes them.

  • The app performance is much higher on iOS than on Android even for high-end mobile devices. Finding the root cause of this requires further investigation.
  • The sp method of the AuxUtils C++ class is intended to provide font pixel sizes independent of the screen size and resolution, although it does not work for all devices. Therefore, the same implementation as the one provided by the Felgo QML sp function should be considered.
  • Asset files can be easily accessed from QML and Qt classes. For instance, assets:/assets/model.pb gives access to a file called model.pb stored in the assets folder on Android. However, accessing assets from general C++ classes is not so easy because those classes cannot resolve assets:/. This is the case for the Tensorflow C++ class. The current solution is to copy the file to a well-known path, for example to QStandardPaths::writableLocation(QStandardPaths::AppLocalDataLocation), but this involves checking whether the destination folder exists (and creating it otherwise) and checking whether the asset file exists and has not changed (and copying it otherwise).
  • QVideoFrame conversion to QImage is performed in order to draw on it in the run method of the ObjectsRecogFilterRunable C++ class. Currently, this is done using the qt_imageFromVideoFrame function included in a Qt private module, multimedia-private. Therefore, the app is tied to this specific Qt module build version, and running the app against other versions of the Qt modules may crash at any arbitrary point. Additionally, the conversion of BGR video frames is not properly managed by the qt_imageFromVideoFrame function, so they are converted to images without using this function.
  • The current implementation continuously executes TensorFlow in a separate thread, processing video frames: when the TensorFlow thread finishes, it is executed again with the latest frame. This approach provides a fluent user experience, but on the other hand it makes the device heat up considerably and drains the battery fast.

If you have a business request for assistance to integrate TensorFlow in your Felgo apps, don’t hesitate to drop a line at support@felgo.com or contact us here. The Felgo SDK is free to use, so make sure to check it out!

If you enjoyed this post, feel free to share it on Facebook or Twitter.

More Relevant App Development Resources

The Best App Development Tutorials & Free App Templates

All of these tutorials come with full source code of the mobile apps! You can copy the code to make your own apps for free!

App Development Video Tutorials

Make Cross-Platform Apps with Qt: Felgo Apps

How to Add In-App Chat or Gamification Features to Your Mobile App

How to Make a Mobile App with Qt Quick Designer (QML Designer) & Felgo