Generate Generic C/C++ Code for Deep Learning Networks in Simulink
As of R2021a, you can use Simulink Coder and Embedded Coder to generate generic ANSI/ISO-compliant C and C++ code, free from dependence on third-party deep learning libraries, for Simulink models containing deep learning networks. Incorporate convolutional and recurrent neural networks into Simulink with blocks from the Deep Learning Toolbox, including the Stateful Predict and Stateful Classify blocks. Apply pre- and post-processing to time-series data for use within Simulink models. Verify the numerical equivalence of the generated code with software-in-the-loop (SIL) and processor-in-the-loop (PIL) testing support from Embedded Coder. Deploy the generated code to any processor that can compile C and C++, including ARM Cortex-M processors, digital signal processors, microcontrollers, and more. For example, deploy a long short-term memory (LSTM) network to an STM32F7 Discovery Board with the Embedded Coder Hardware Support Package for STMicroelectronics Discovery Boards.
As of R2021a, you can now use Simulink Coder and Embedded Coder to generate generic C and C++ code from Simulink models containing deep neural networks. The generic C and C++ code does not depend on third-party libraries and, as a result, can be deployed to any embedded processor that compiles C and C++ code, including ARM Cortex-M series processors, DSPs, and microcontrollers from a range of device vendors.
Generic C and C++ code can be generated for both convolutional and recurrent deep learning neural networks in Simulink. You can incorporate network objects into your Simulink models with blocks from the Deep Learning Toolbox, including the Stateful Classify and Stateful Predict blocks introduced in R2021a, or with the MATLAB Function block. A variety of networks and layers are supported for generic C and C++ code generation. The full list can be found in our documentation.
Now let's deploy a deep learning network from Simulink to an ARM Cortex-M processor. In this example, we'll use an LSTM network to predict the remaining useful life of turbofan engines. The pre-trained network accepts sequence data from 17 engine sensors and outputs a prediction of the engines' remaining useful life, measured in cycles. The time-series input data is fed to the model with a From Workspace block and is then sent to the Predict block.
Using the Deep Network Designer app, we can take a deeper look into the network and see that it contains six layers, including an LSTM layer. Let's simulate the model in Simulink. From the simulation, we can see that the pre-trained neural network forecasts the remaining useful life of the turbofan engines relatively well in each of the observations shown here. The predicted values closely follow the actual values, with a root mean square error of 20.37.
Now let's generate generic C code from the model and deploy it to an ARM Cortex-M processor. In Configuration Parameters, we'll select the STM32F746G-Discovery board as our hardware board. Under Code Generation, we've set the system target file to use Embedded Coder and the target language to C. We'll use the GNU Tools for ARM Embedded Processors as our toolchain. Finally, we'll ensure the deep learning target library is set to None to remove any dependencies on third-party libraries. With these settings in place, let's generate code for the subsystem containing the neural network.
In the code-generation report, we can see the files generated do not include any external deep learning libraries. Let's search for the model step function in the generated code. Looking at the step function, we can see that it contains a predict method. Inside the predict method, we can see a set of weights and biases defined for use in the neural network. Scrolling down, we can see the code used to calculate the outputs of the neural net for the time series inputs passed in at each time step.
With the code successfully generated, let's deploy it to our target. We'll use processor-in-the-loop (PIL) execution to numerically verify the generated code's output. In the second model, we've placed the same Predict block that we saw previously inside a model reference and set its simulation mode to PIL. In the SIL/PIL Manager app, we'll set our simulation mode to SIL/PIL Only to collect results from running the generated code with PIL execution, and choose model blocks in SIL/PIL mode as our system under test.
After the simulation, we'll compare the prediction outputs from running the generic C code on hardware to those from Simulink. Now let's run the generated code. Code is first generated for the model reference block in PIL mode. A connection is then established with our Cortex-M Discovery board, and finally, the source code is built and downloaded to it.
The left set of plots shows that the results from running the generated code on hardware closely follow the actual test data. To verify the code's numerical accuracy, we created a second set of plots on the right showing the differences between the predictions from the Simulink simulation and those from PIL execution on our target hardware. For all 10 observations, the deviation stays within a tolerance band of about five hundred-thousandths, a negligible effect on the neural network's accuracy.
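The comparison behind those difference plots can be sketched as a maximum absolute deviation check between two output traces. The C below is illustrative only: the function names and the tolerance argument are hypothetical, and it is not how the SIL/PIL Manager itself performs the comparison.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Largest absolute difference between simulation and PIL outputs.
   Hypothetical helper for illustration; returns 0.0 for empty input. */
double max_abs_deviation(const double *sim, const double *pil, size_t n)
{
    assert(sim != NULL && pil != NULL);
    double worst = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double d = fabs(sim[i] - pil[i]);
        if (d > worst) {
            worst = d;
        }
    }
    return worst;
}

/* Returns 1 if every sample differs by at most tol, otherwise 0. */
int within_tolerance(const double *sim, const double *pil, size_t n, double tol)
{
    return max_abs_deviation(sim, pil, n) <= tol;
}
```

A check like this would be run once per observation, with tol set to whatever band is acceptable for the application.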
Overall, we've seen how it's now possible to generate generic C and C++ code from deep learning networks in Simulink that does not depend on optimization libraries, effectively extending the use of these networks to nearly any embedded processor. Additionally, with processor-in-the-loop testing, we verified that the output from the generated code matches that from the Simulink simulation. To learn more about deep learning code generation in MATLAB and Simulink, please click one of the links below or refer to our documentation.