Cross-Compile Deep Learning Code That Uses ARM Compute Library
On the computer that hosts your MATLAB® session, you can generate deep learning source code and compile it to create a library or an executable that runs on a target ARM® hardware device. The compilation of source code on one platform to create binary code for another platform is known as cross-compilation. This workflow is supported only for the Linux® host platform and target devices that have armv7 (32-bit) or armv8 (64-bit) ARM architecture.
Use this workflow to deploy deep learning code on ARM devices that do not have hardware support packages.
Note
The ARM Compute library version that the examples in this help topic use might not be the latest version that code generation supports. For supported versions of libraries and for information about setting up environment variables, see Prerequisites for Deep Learning with MATLAB Coder.
Prerequisites
These are the prerequisites specific to the cross-compilation workflow. For the general prerequisites, see Prerequisites for Deep Learning with MATLAB Coder.
The target device must have armv7 (32-bit) or armv8 (64-bit) ARM architecture. To verify the architecture of your device, run this command in a terminal on the device:
arch
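On an armv8 (64-bit) device, this command typically prints aarch64; on an armv7 (32-bit) device, it typically prints a value such as armv7l.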
You must have the Linaro AArch32 or AArch64 toolchain installed on the host computer.
For an armv7 target, install the GNU/GCC g++-arm-linux-gnueabihf toolchain on the host.
For an armv8 target, install the GNU/GCC g++-aarch64-linux-gnu toolchain on the host.
For example, to install the Linaro AArch64 toolchain on the host, run this command in the terminal:
sudo apt-get install g++-aarch64-linux-gnu
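To confirm that the toolchain is installed and to locate its binaries, you can query the cross compiler from the host terminal. This check assumes the Debian/Ubuntu package shown above, which installs the compiler as aarch64-linux-gnu-g++ (or arm-linux-gnueabihf-g++ for an armv7 target):
aarch64-linux-gnu-g++ --version
which aarch64-linux-gnu-g++
The second command prints the location of the compiler binary. Its folder (for example, /usr/bin) is the path that you set in the environment variable in the next step.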
At the MATLAB command line, set the environment variable LINARO_TOOLCHAIN_AARCH32 or LINARO_TOOLCHAIN_AARCH64 to the path of the toolchain binaries. You must set the path once per MATLAB session. Suppose that the toolchain is installed at the location /usr/bin on the host.
For an armv7 target, run this command:
setenv('LINARO_TOOLCHAIN_AARCH32', '/usr/bin')
For an armv8 target, run this command:
setenv('LINARO_TOOLCHAIN_AARCH64', '/usr/bin')
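To confirm that the variable is set for the current MATLAB session, you can read it back. For example, for an armv8 target:
getenv('LINARO_TOOLCHAIN_AARCH64')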
Cross-compile the ARM Compute library on the host:
Clone the Git™ repository for the ARM Compute library and check out the version that you need. For example, to check out v19.05, run these commands in the host terminal:
git clone https://github.com/Arm-software/ComputeLibrary.git
cd ComputeLibrary
git tag -l
git checkout v19.05
Install scons on the host. For example, run this command in the host terminal:
sudo apt-get install scons
Use scons to cross-compile the ARM Compute library on the host. For example, to build the library to run on the armv8 architecture, run this command in the host terminal:
scons Werror=0 -j8 debug=0 neon=1 opencl=0 os=linux arch=arm64-v8a openmp=1 cppthreads=1 examples=0 asserts=0 build=cross_compile
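For an armv7 (32-bit) target, the build command typically differs only in the architecture flag. This is a sketch based on the scons options of ARM Compute library v19.05; check the scons options documented for your library version:
scons Werror=0 -j8 debug=0 neon=1 opencl=0 os=linux arch=armv7a openmp=1 cppthreads=1 examples=0 asserts=0 build=cross_compile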
At the MATLAB command line, set the environment variable ARM_COMPUTELIB to the path of the ARM Compute library on the host. You must set the path once per MATLAB session. Suppose that the ARM Compute library is installed at the location /home/$(USER)/Desktop/ComputeLibrary. Run this command at the MATLAB command line:
setenv('ARM_COMPUTELIB','/home/$(USER)/Desktop/ComputeLibrary')
The folder that contains the library files, such as libarm_compute.so, must be named lib. If the folder is named build, rename it to lib.
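For example, if scons placed the compiled library files in a build folder under the repository, a rename on the host is sufficient. This sketch assumes the example path used above, with $USER expanded by the shell:
cd /home/$USER/Desktop/ComputeLibrary
mv build lib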
Generate and Deploy Deep Learning Code
There are two possible workflows for cross-compiling deep learning code on your host computer and then deploying the code on the target ARM hardware. Here is a summary of the two workflows. For an example that demonstrates both workflows, see Cross Compile Deep Learning Code for ARM Neon Targets.
On the host computer, you generate a static or dynamic library for the deep learning code. Follow these steps:
1. On the host, use the codegen command to generate and build the deep learning code to create a static or dynamic library (see the sketch after this list).
2. Copy the generated library, the ARM Compute library files, the makefile, and other supporting files to the target hardware.
3. Compile the copied makefile on the target to create an executable.
4. Run the generated executable on the target hardware.
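The following MATLAB sketch illustrates the codegen step of this first workflow. The entry-point function myDLPredict, its input size, and the output folder arm_dl_lib are hypothetical placeholders; the ARM architecture and ARM Compute library version must match the library that you cross-compiled:
% Configure code generation for a static library
cfg = coder.config('lib');
cfg.TargetLang = 'C++';
% Configure the deep learning target library
dlcfg = coder.DeepLearningConfig('arm-compute');
dlcfg.ArmArchitecture = 'armv8';     % use 'armv7' for a 32-bit target
dlcfg.ArmComputeVersion = '19.05';   % match the cross-compiled library version
cfg.DeepLearningConfig = dlcfg;
% Generate and build the library on the host (myDLPredict is a hypothetical entry point)
codegen -config cfg myDLPredict -args {ones(224,224,3,'single')} -d arm_dl_lib -report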
On the host computer, you generate an executable for the deep learning code. Follow these steps:
1. On the host, use the codegen command to generate and build the deep learning code to create an executable (see the sketch after this list).
2. Copy the generated executable, the ARM Compute library files, and other supporting files to the target hardware.
3. Run the executable on the target hardware.
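Similarly, this is a sketch of the codegen step for the second workflow. It assumes the same hypothetical entry-point function myDLPredict and a hand-written main file main_myDLPredict.cpp in the current folder that calls the generated entry-point code:
% Configure code generation for an executable
cfg = coder.config('exe');
cfg.TargetLang = 'C++';
dlcfg = coder.DeepLearningConfig('arm-compute');
dlcfg.ArmArchitecture = 'armv8';
dlcfg.ArmComputeVersion = '19.05';
cfg.DeepLearningConfig = dlcfg;
% Hand-written main file that calls the generated code (hypothetical file name)
cfg.CustomSource = 'main_myDLPredict.cpp';
cfg.CustomInclude = pwd;
% Generate and build the executable on the host
codegen -config cfg myDLPredict -args {ones(224,224,3,'single')} -d arm_dl_exe -report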