Source: Deep Learning on Medium
Deep Learning on ARM Processors — Designing Neural Network Solutions for Low-Power Edge Devices
Machine learning (ML) algorithms are moving to the IoT edge because of latency, power consumption, cost, network bandwidth, reliability, privacy and security considerations. As a result, there is growing interest in developing neural network (NN) solutions that can be deployed on low-power edge devices such as Arm Cortex-M microcontroller systems. CMSIS-NN is an open-source library of optimized software kernels that maximize NN performance on Cortex-M cores with minimal memory footprint overhead.
Apart from CMSIS-NN, there are other open-source libraries, such as Arm NN and the Arm Compute Library, for optimizing ML on Arm-based processors and IoT edge devices. ML workloads can run entirely on edge devices, allowing sophisticated AI-enabled software to run almost anywhere, even without network access.
What do we need for this tutorial?
1. STM32F4 board
2. USB cable
3. Keil software (MDK-Arm)
4. Tera Term software
1. Create a new project in Keil MDK-ARM (e.g. simple_nn).
2. After you create the project, a pop-up window asks you to select the target device (e.g. STM32F411VETx).
3. In the Manage Run-Time Environment window, select “Core” under CMSIS and “Startup” under Device, then proceed.
4. (a) Change the target name (I used the name of the board).
(b) Change the source group name to whatever you prefer (I used app).
(c) Add a new group by right-clicking on the target and name it “nn_lib” (any name works).
(d) Add another group and name it “drivers”.
We have now created three groups, each holding different files for the neural network:
app: holds the main program of the NN.
nn_lib: holds the NN library code needed for the implementation.
drivers: holds the UART driver files for communication between the computer and the microcontroller.
5. Right-click the app group, choose to add a new item to the group, and create a “main.c” file. Do the same for the nn_lib group and create two files, one with a .c extension and one with a .h extension, for the library code (simple_neural_networks.c and simple_neural_networks.h). Repeat for the drivers group (uart.c and uart.h).
6. Copy the contents of each file from my GitHub repository.
The code in the files is self-explanatory and easy to follow.
7. Save the project and click the Build icon.
After the build, the output should report no errors. You can ignore the warnings.
8. Right-click the target name (stm32f4) and select “Options for Target”. In the window that opens, go to the Debug section and select “ST-Link Debugger”.
9. After selecting the debugger, load the program onto the microcontroller. (Loading should finish without errors; you can ignore the warnings.)
10. Once loading is complete, open Tera Term and, under Serial, select the STMicroelectronics COM port.
With the port selected, the output appears on the screen and we can inspect the results.
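To give a feel for how results might reach the terminal, here is a rough sketch of the text-output side. It is not the driver from the repository: the names uart_write_char, uart_write_string and report_output are made up for illustration, and the transmit routine is replaced by a buffer so the sketch can run on a PC. On the board, that one function would hand the byte to the UART peripheral instead.

```c
#include <stddef.h>
#include <stdio.h>

/* Host-side stand-in for the board's UART transmit routine (hypothetical name).
 * On the microcontroller this would pass the byte to the UART peripheral;
 * here it appends to a buffer so the sketch can run on a PC. */
static char   tx_log[128];
static size_t tx_len = 0;

static void uart_write_char(char c)
{
    if (tx_len < sizeof(tx_log) - 1) {
        tx_log[tx_len++] = c;
        tx_log[tx_len] = '\0';
    }
}

/* Send a string, emitting CR before every LF so the terminal
 * returns to the first column on each new line. */
static void uart_write_string(const char *s)
{
    for (; *s != '\0'; s++) {
        if (*s == '\n') {
            uart_write_char('\r');
        }
        uart_write_char(*s);
    }
}

/* Format one network output as a line of text and send it. */
static void report_output(unsigned index, float value)
{
    char line[32];
    snprintf(line, sizeof(line), "output[%u] = %.2f\n", index, (double)value);
    uart_write_string(line);
}
```

Calling report_output once per output value after the network has run would produce one line per output in the terminal window.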
The whole idea of this tutorial is to design and implement a multiple-input, multiple-output neural network on Arm processors.
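A multiple-input, multiple-output layer boils down to one weighted sum per output. The following is a minimal sketch in plain C, not the code from the repository. The function names (weighted_sum, mimo_layer) and the flat, row-major weight layout are assumptions made for illustration.

```c
#include <stdint.h>

/* Weighted sum for a single output:
 * dot product of the inputs and one row of weights. */
static float weighted_sum(const float *inputs, const float *weights,
                          uint32_t num_inputs)
{
    float sum = 0.0f;
    for (uint32_t i = 0; i < num_inputs; i++) {
        sum += inputs[i] * weights[i];
    }
    return sum;
}

/* Hypothetical layer function. The weights array is row-major:
 * row k holds the num_inputs weights that feed output k. */
static void mimo_layer(const float *inputs, uint32_t num_inputs,
                       const float *weights,
                       float *outputs, uint32_t num_outputs)
{
    for (uint32_t k = 0; k < num_outputs; k++) {
        outputs[k] = weighted_sum(inputs, &weights[k * num_inputs], num_inputs);
    }
}
```

For three inputs and two outputs, weights would hold six values: the first three feed output 0 and the last three feed output 1. Keeping the weights in one const array also lets the toolchain place them in flash rather than RAM.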
Limitations of Microcontrollers
- Limited Memory Footprint
- Limited Compute Resources
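One common way to work within both limits is to store weights and activations as 8-bit fixed-point numbers instead of 32-bit floats, which cuts their storage to a quarter and turns the inner loop into integer multiply-accumulates. The sketch below shows the idea under simple assumptions: values lie roughly in [-1, 1), the format is Q0.7 (7 fractional bits), and the helper names are hypothetical. It is not CMSIS-NN's API, only an illustration of the kind of arithmetic such libraries build on.

```c
#include <stdint.h>

/* Convert a float to 8-bit Q0.7 fixed point (value * 2^7),
 * rounding to nearest and saturating to the int8_t range. */
static int8_t float_to_q7(float x)
{
    float scaled = x * 128.0f;
    if (scaled >= 127.0f) {
        return 127;
    }
    if (scaled <= -128.0f) {
        return -128;
    }
    return (int8_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Dot product of two Q0.7 vectors with a 32-bit accumulator.
 * Each product has 14 fractional bits, so the result is in Q14. */
static int32_t dot_q7(const int8_t *a, const int8_t *b, uint32_t len)
{
    int32_t acc = 0;
    for (uint32_t i = 0; i < len; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}

/* Convert a Q14 accumulator back to float, e.g. for printing. */
static float q14_to_float(int32_t acc)
{
    return (float)acc / 16384.0f;
}
```

The price is precision: each value is rounded to the nearest 1/128, so a real design has to check that the network still gives acceptable results after quantization.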
This article shows how you can design and implement NN and move AI from the cloud to the edge. Arm leads in this transition with hardware that’s tailored for inference and software components to help you implement your solutions no matter which ML framework you choose.
In the meantime, here are some resources I find useful for learning about Arm Cortex-M microcontrollers, STM32, CMSIS-NN and Keil MDK.
Don’t forget to check out the source code from my GitHub page.