Code Generation for Sound Classification on ARM Cortex-M Targets using CMSIS-NN
This example shows how to use the CMSIS-NN library and a pretrained network to generate code for the melSpectrogram (Audio Toolbox) example. In this example, you classify white noise, brown noise, and pink noise by generating a processor-in-the-loop (PIL) MEX function, which executes the generated code on target hardware such as the STM32 Nucleo F767ZI board. The PIL interface in the MATLAB™ environment lets you call the generated executable from MATLAB while it runs on the target hardware.
Sound classification plays a crucial role in various applications such as speech recognition, audio surveillance, and environmental sound monitoring. Deep learning techniques have shown great potential in accurately classifying sound. The CMSIS-NN library is used to optimize deep learning models for execution on microcontrollers and embedded systems. It enables low-precision (int8) inference, leading to improved memory and computational efficiency.
To run this example, you need Cortex-M™ hardware, such as the STM32 Nucleo F767ZI board, and the CMSIS-NN library.
The workflow in this example has four steps: calibrating the network and saving a quantizer object, generating code from the quantizer object and the network, validating the results on test data, and measuring the performance gain.
Generate Calibration Result File
Load the pretrained network from its .mat file. Create a dlquantizer (Deep Learning Toolbox) object and specify the network. Note that code generation does not support quantized deep neural networks produced by the quantize (Deep Learning Toolbox) function.
load('soundClassificationNet.mat');
quantizedNet = dlquantizer(net, 'ExecutionEnvironment', 'CPU');
Load the training data .mat file, which contains the featuresTrain and labelsTrain variables. The featuresTrain variable contains the white noise, brown noise, and pink noise signals, and the labelsTrain variable contains their corresponding labels. Use the calibrate (Deep Learning Toolbox) function to exercise the network with a set of inputs and collect range information. This function does not retrain the network. It collects the dynamic ranges of the weights and biases in the long short-term memory (LSTM) and fully connected layers of the network, and the dynamic ranges of the activations in all layers of the network.
load('soundClassificationTrainingData.mat');
featuresDatastore = arrayDatastore(featuresTrain,"IterationDimension",1,"OutputType","same");
labelsTrain = cellstr(labelsTrain);
labelsDatastore = arrayDatastore(labelsTrain,"IterationDimension",1,"OutputType","same");
data = combine(featuresDatastore,labelsDatastore);
quantizedNet.calibrate(data);
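To make the role of the collected ranges more concrete, the following sketch shows one common way a dynamic range can be turned into an int8 scale factor. It is an illustration only: the sample values, the variable names, and the symmetric scaling scheme are assumptions, and they do not describe how calibrate or the generated CMSIS-NN code quantizes values internally.

```matlab
% Illustrative only: symmetric int8 quantization of a few sample values.
% This is NOT the algorithm used by calibrate or by the generated code.
x = [-0.82 -0.10 0.00 0.37 1.24];   % hypothetical activation values

% Dynamic range, similar in spirit to what a calibration pass records
minVal = min(x);
maxVal = max(x);

% Map the largest magnitude onto the int8 limit 127
scale = max(abs([minVal maxVal])) / 127;

% Quantize (int8 rounds to nearest and saturates), then dequantize
q = int8(x / scale);
xHat = double(q) * scale;

% The rounding error is at most half a quantization step
maxErr = max(abs(x - xHat));
fprintf('scale = %.5f, max error = %.5f\n', scale, maxErr);
```

A wider recorded range gives a larger scale and therefore a coarser int8 grid, which is one reason calibration data should be representative of the signals the network sees at run time.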
Save the dlquantizer object as a .mat file to pass it to the codegen function.
save('soundClassificationQuantObj.mat', 'quantizedNet')
Configure and Generate PIL MEX Function
The net_predict.m entry-point function uses the coder.loadDeepLearningNetwork function to load a deep learning model and to construct and set up a recurrent neural network. The entry-point function then predicts the responses by using the predict (Deep Learning Toolbox) function.
type net_predict.m
% Copyright 2021-23 The MathWorks, Inc.
function out = net_predict(netMatFile, in)
    persistent net
    if isempty(net)
        net = coder.loadDeepLearningNetwork(netMatFile);
    end
    out = net.predict(in);
end
To generate a PIL MEX function, create a coder.config object for a static library and set the verification mode to 'PIL'. Set the target language to C. Limit the stack size to a reasonable value, for example 512 bytes, because the default size is much larger than the memory available on the hardware board.
cfg = coder.config('lib', 'ecoder', true);
cfg.VerificationMode = 'PIL';
cfg.StackUsageMax = 512;
cfg.TargetLang = 'C';
cfg.CodeExecutionProfiling = true;
Create a coder.DeepLearningConfig deep learning configuration object for the CMSIS-NN library.
dlcfg = coder.DeepLearningConfig('cmsis-nn');
To generate code that performs low-precision (int8) inference, assign the .mat file that contains the saved dlquantizer object to the CalibrationResultFile property of the coder.DeepLearningConfig object. Then set the DeepLearningConfig property of the coder.config object cfg to the coder.DeepLearningConfig object dlcfg.
dlcfg.CalibrationResultFile = 'soundClassificationQuantObj.mat';
cfg.DeepLearningConfig = dlcfg;
Create a coder.hardware object for the STM32 Nucleo F767ZI board. Set the Hardware property of the coder.config object cfg to the coder.hardware object hw. In the following code, replace COM4 with the port number to which you have connected the Cortex-M hardware. On the Cortex-M hardware board, set the CMSISNN_PATH environment variable to the location of the CMSIS-NN library build on the Cortex-M board. For more information on building a library and setting environment variables, see Prerequisites for Deep Learning with MATLAB Coder.
hardware = 'STM32 Nucleo F767ZI';
hw = coder.hardware(hardware);
hw.PILInterface = 'Serial';
% Replace COM4 with the actual port number
hw.PILCOMPort = 'COM4';
cfg.Hardware = hw;
Load the test data, attached as a MAT-file, that contains the featuresTest and labelsTest variables. The featuresTest variable contains new white noise, brown noise, and pink noise signals, and the labelsTest variable contains their corresponding labels.
load('soundClassificationTestingData.mat')
args = {coder.Constant('soundClassificationNet.mat'), featuresTest{1}};
Use the codegen command to generate a PIL MEX function.
codegen -config cfg net_predict -args args -report
Run Generated PIL MEX Function
Run the generated MEX function net_predict_pil on the test data to classify the white noise, brown noise, and pink noise signals.
outputPil = cell(size(featuresTest));
classNames = {'white', 'brown', 'pink'};
predictedClasses = cell(size(featuresTest));
for i = 1:numel(featuresTest)
    outputPil{i} = net_predict_pil('soundClassificationNet.mat', featuresTest{i});
    [~, classIdx] = max(outputPil{i});
    predictedClasses{i} = classNames{classIdx};
end
Create a confusion matrix chart from the true labels in labelsTest and the predicted labels in predictedClasses.
cm = confusionchart(labelsTest,predictedClasses);
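The confusion chart can be complemented by a single accuracy figure. The following is a minimal sketch that uses hypothetical stand-in data (labelsDemo and predictedDemo are not part of the example) so that it runs on its own. In the example workspace, you would substitute labelsTest and predictedClasses, assuming labelsTest can be converted with cellstr.

```matlab
% Hypothetical stand-in data so that the sketch runs on its own.
% In the example, use labelsTest and predictedClasses instead.
labelsDemo = categorical({'white'; 'brown'; 'pink'; 'pink'});
predictedDemo = {'white'; 'brown'; 'pink'; 'brown'};

% Convert the true labels to a cell array of character vectors and
% compare them element by element with the predicted class names
trueClasses = cellstr(labelsDemo(:));
isCorrect = strcmp(trueClasses, predictedDemo(:));

% Fraction of signals classified correctly
accuracy = mean(isCorrect);
fprintf('Classification accuracy: %.1f%%\n', 100 * accuracy);
```

The (:) indexing forces both inputs into column vectors, so the comparison does not depend on whether the labels are stored as a row or a column.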
Measure Performance Gain
Terminate the PIL execution to measure the execution time of CMSIS-NN code.
clear net_predict_pil
Generate an execution profile report to evaluate execution time.
executionProfileCMSISNN = getCoderExecutionProfile('net_predict');
report(executionProfileCMSISNN, ...
    'Units','Seconds', ...
    'ScaleFactor','1e-03', ...
    'NumericFormat','%0.4f')
executionTimeCMSISNN = mean([executionProfileCMSISNN.Sections.ExecutionTimeInSeconds]);
To obtain a baseline for the comparison, create a coder.DeepLearningConfig object with the target library set to 'none'. With this setting, the code generator produces plain C code that does not call the CMSIS-NN library.
deepLearningCfg = coder.DeepLearningConfig('none');
cfg.DeepLearningConfig = deepLearningCfg;
Use the codegen command to generate a PIL MEX function.
codegen('net_predict.m', '-config', cfg, '-args', args, '-report');
Run the generated MEX function net_predict_pil on the test data, and then terminate the PIL execution.
for i = 1:numel(featuresTest)
    outputPil{i} = net_predict_pil('soundClassificationNet.mat', featuresTest{i});
    [~, classIdx] = max(outputPil{i});
    predictedClasses{i} = classNames{classIdx};
end
clear net_predict_pil
Measure the execution time of the plain C code by generating an execution profile report.
executionProfilePlainC = getCoderExecutionProfile('net_predict');
report(executionProfilePlainC, ...
    'Units','Seconds', ...
    'ScaleFactor','1e-03', ...
    'NumericFormat','%0.4f')
executionTimePlainC = mean([executionProfilePlainC.Sections.ExecutionTimeInSeconds]);
Calculate the performance gain of CMSIS-NN over plain C.
CMSISNNPerformanceGainOverPlainC = executionTimePlainC ./ executionTimeCMSISNN
CMSISNNPerformanceGainOverPlainC = 1.3305
bar(["CMSIS-NN","plain C"],[executionTimeCMSISNN;executionTimePlainC])
ylabel('Execution Time (seconds)');
title('Performance Comparison of CMSIS-NN and plain C');
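The gain of 1.3305 reported above is a ratio of execution times, so it can also be read as a relative reduction in execution time. The following sketch does that conversion. The variable names are illustrative, and the value is copied from the output shown in this example rather than measured again.

```matlab
% Gain reported in this example (plain C time divided by CMSIS-NN time)
gain = 1.3305;

% A gain of g means the CMSIS-NN code takes 1/g of the plain C time,
% so the relative reduction in execution time is 1 - 1/g
reductionPct = 100 * (1 - 1/gain);
fprintf('Execution time reduced by about %.1f%%\n', reductionPct);
```

For this gain, the CMSIS-NN code takes roughly three quarters of the time of the plain C code on the same test inputs. The actual figure depends on the hardware, the compiler settings, and the network.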