C Code Optimization using Codegen for Qualcomm Hexagon DSP
This example demonstrates the workflow of generating optimized C code in MATLAB® using Codegen for the Qualcomm Hexagon Simulator.
The example utilizes a dsp.FIRFilter System object to filter two sine waves with different frequencies using the Embedded Coder® Support Package for Qualcomm® Hexagon® Processors and the Qualcomm Hexagon QHL Code Replacement Library (CRL).
Supported Hardware
Qualcomm Hexagon Simulator
Qualcomm Hexagon Android Board
Prerequisites
Launch the hardware setup and install the Qualcomm SDK. For more information, see Launch Hardware Setup.
Required Hardware
To run this example, you need the following hardware:
Supported Qualcomm Hexagon Simulator
Create MATLAB Coder Configuration Object
Create a MATLAB Coder configuration object by running this code at the MATLAB command prompt.
cfg = coder.config('lib','ecoder',true);
% Configure the hardware and select the processor version
cfg.Hardware = coder.Hardware('Qualcomm Hexagon Simulator');
% cfg.Hardware.CPUClockRate = 300; % Set clock to 300 MHz
cfg.Hardware.ProcessorVersion = 'V68'; % Set processor version to V68
cfg.CodeReplacementLibrary = "Qualcomm Hexagon QHL";
% Enable the code replacement report, which is useful for analyzing the replacement functions
cfg.GenerateCodeReplacementReport = true;
% Optionally, for PIL verification, set these configuration options
cfg.VerificationMode = "PIL";
cfg.CodeExecutionProfiling = true;
% Steps for enabling code generation
% To generate a report with Hexagon Profiler or gprof, set the following:
% cfg.Hardware.Profiler = "Profiler Name";
% Profiler Name can be one of the following:
%   a. None
%   b. Hexagon Profiler
%   c. gprof
%   d. Hexagon Profiler and gprof
% cfg.Hardware.SimulatorOptions = "Simulator Options";
% Simulator options are optional and are described in the Hexagon Simulator documentation, for example '--timing'.
% cfg.Hardware.ProfilerOptions = "Profiler Options";
% Hexagon Profiler options are optional and are described in the Hexagon Profiler documentation.
Alternatively, you can set up the configuration object in the GUI:
1. Open the configuration dialog box by running this command.
cfg.dialog
2. Configure the Hardware board, Hardware configuration, and Build settings.
3. Configure the Code replacement library.
4. To debug, enable the code replacement report.
Generate Code
Run this command to generate code and launch the code generation report.
numSamples = 160;
codegen -config cfg ex_fir_hexagon_ml -args {zeros(numSamples,1,'single')} -launchreport
### Connectivity configuration for function 'ex_fir_hexagon_ml': 'Hexagon Simulator'
Code generation successful: View report
To view the replacement hits, navigate to the summary section and click on Code Replacements.
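The listing of ex_fir_hexagon_ml is not reproduced in this example. As a rough idea of what such an entry-point function might look like, the following sketch wraps a dsp.FIRFilter System object in a persistent variable. The filter order (63) and normalized cutoff (0.33), and the use of fir1 from Signal Processing Toolbox, are assumptions made here for illustration and are not taken from the shipped example.

```matlab
function y = ex_fir_hexagon_ml(u) %#codegen
% Hypothetical sketch of the entry-point function; the shipped file may differ.
% u is a numSamples-by-1 single-precision frame.

persistent firFilt
if isempty(firFilt)
    % Assumed design: 63rd-order low-pass FIR with a cutoff of 0.33 times
    % the Nyquist frequency (about 2.64 kHz at a 16 kHz sample rate), which
    % passes the 1 kHz tone and attenuates the 5 kHz tone.
    firFilt = dsp.FIRFilter('Numerator', fir1(63, 0.33));
end

% Filter one frame; the System object keeps its state between calls.
y = firFilt(u);
end
```

Keeping the System object persistent means the filter state carries over from one frame to the next, so consecutive calls behave like one continuous stream.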
Numerical Verification
To verify numerical accuracy, you need a test script that compares the MATLAB implementation output with the output of the generated PIL MEX file.
sin1 = dsp.SineWave('Amplitude',1,'Frequency',1000,...
    'SampleRate',16000, 'SamplesPerFrame', numSamples,...
    'OutputDataType', 'single');
sin2 = dsp.SineWave('Amplitude',4,'Frequency',5000,...
    'SampleRate',16000, 'SamplesPerFrame', numSamples,...
    'OutputDataType', 'single');
numSteps = 200;
frameLength = sin1.SamplesPerFrame;
yRef = zeros(frameLength,1,numSteps,'single');
y = zeros(frameLength,1,numSteps,'single');
for k = 1:numSteps
    x1k = sin1(); % generate 1 kHz sine wave
    x5k = sin2(); % generate 5 kHz sine wave
    n = randn(size(x1k), 'single')*sqrt(.05); % generate noise signal
    u = x1k+x5k+n;
    % Run with MATLAB code on host machine
    yRef(:,:,k) = ex_fir_hexagon_ml(u);
    % Run with generated code on target
    y(:,:,k) = ex_fir_hexagon_ml_pil(u);
end
### Starting application: 'codegen\lib\ex_fir_hexagon_ml\pil\ex_fir_hexagon_ml.elf'
    To terminate execution: clear ex_fir_hexagon_ml_pil
    Execution profiling data is available for viewing. Open Simulation Data Inspector.
    Execution profiling report will be available after termination.
clear ex_fir_hexagon_ml; clear ex_fir_hexagon_ml_pil;
### Stopping application.
    Execution profiling report: coder.profile.show(getCoderExecutionProfile('ex_fir_hexagon_ml'))
After execution, you can compare the outputs of the PIL against the reference using any norm function.
% norm comparison of y & yRef
absoluteError = norm(y(:)-yRef(:),'inf');
fprintf("Absolute error = %g \n",absoluteError);
Absolute error = 8.34465e-07
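The infinity norm above collapses all frames into a single number. To see whether the error is concentrated in particular frames, the per-frame maximum error can be computed as in the following sketch. The helper name maxErrorPerFrame is hypothetical and is not part of the example files.

```matlab
function errPerFrame = maxErrorPerFrame(y, yRef)
% maxErrorPerFrame  Hypothetical helper, not part of the shipped example.
% y and yRef are frameLength-by-1-by-numSteps arrays, as built in the
% verification loop. Returns a numSteps-by-1 vector holding the largest
% absolute difference within each frame.

frameErr = max(abs(y - yRef), [], 1);  % 1-by-1-by-numSteps
errPerFrame = squeeze(frameErr);       % numSteps-by-1
end
```

Calling errPerFrame = maxErrorPerFrame(y, yRef) and then plot(errPerFrame) shows the error per frame; max(errPerFrame) equals the infinity norm computed above.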
Alternatively, you can use the verifyEqual method on a test case created with matlab.unittest.TestCase.forInteractiveUse. This method compares the outputs against the given absolute and relative tolerances and reports whether the verification passed or failed.
reltol = single(1e-5);
abstol = single(1e-5);
matlab.unittest.TestCase.forInteractiveUse.verifyEqual(y,yRef, ...
    'RelTol',reltol, 'AbsTol', abstol);
Verification passed.
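To make the tolerance check concrete, the following sketch reproduces the comparison by hand. It assumes the documented verifyEqual behavior that, when both AbsTol and RelTol are given, an element passes if it satisfies either tolerance, with the relative tolerance scaled by the expected value. The helper name withinTolerance is hypothetical.

```matlab
function tf = withinTolerance(actual, expected, absTol, relTol)
% withinTolerance  Hypothetical helper, not part of the shipped example.
% Returns true if every element of actual matches expected within the
% absolute tolerance OR the relative tolerance (assumed verifyEqual rule).

absErr  = abs(actual - expected);
passAbs = absErr <= absTol;                   % absolute tolerance check
passRel = absErr <= relTol .* abs(expected);  % relative tolerance check
tf = all(passAbs(:) | passRel(:));
end
```

With the workspace variables from the example, withinTolerance(y, yRef, abstol, reltol) would be expected to return true whenever the verifyEqual call above passes.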
Similar to Simulink PIL, you can use the Simulation Data Inspector to visualize and compare the previously mentioned outputs. To achieve this, represent the output in the timeseries format.
% Create a timeseries
timeSteps = 1:numSteps;
yTS = timeseries(y,timeSteps,"Name","PIL output(y)");
yRefTS = timeseries(yRef,timeSteps,"Name","MATLAB output (yRef)");
% Open the Simulation Data Inspector
Simulink.sdi.view
To import the earlier time series data into the Simulation Data Inspector, select Import. In the Import dialog box, you can choose to import data from the workspace.
After importing, select Set as Baseline for yRef and Set as Compare to for y.
After importing and configuring the baseline and compare-to settings, navigate to the Compare tab in the SDI. Click Compare to visualize the output of each sample at each time step. To set the tolerance limit, set Global Tolerances under the [+] More section.
Analyze Performance using Code Profile Analyzer
Run this command to show the coder execution profile.
coder.profile.show(getCoderExecutionProfile('ex_fir_hexagon_ml'))
The average execution time of the ex_fir_hexagon_ml function is 17,072 cycles. Repeating the code profiling with the Code Replacement Library set to None gives 42,188 cycles for the same function. Enabling the CRL therefore reduces the cycle count of ex_fir_hexagon_ml by approximately 60% (from 42,188 to 17,072 cycles, a speedup of about 2.5x).
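The improvement can be quantified directly from the two cycle counts reported above. The following sketch, with variable names chosen here for illustration, computes both the percentage reduction in cycles and the equivalent speedup factor.

```matlab
% Cycle counts reported by the execution profile in this example
cyclesWithCRL    = 17072;  % Code replacement library: Qualcomm Hexagon QHL
cyclesWithoutCRL = 42188;  % Code replacement library: None

% Percentage of cycles saved relative to the run without the CRL
reductionPct = 100*(cyclesWithoutCRL - cyclesWithCRL)/cyclesWithoutCRL;

% Speedup factor: how many times faster the CRL build runs
speedup = cyclesWithoutCRL/cyclesWithCRL;

fprintf('Cycle reduction: %.1f%%, speedup: %.2fx\n', reductionPct, speedup);
% Prints: Cycle reduction: 59.5%, speedup: 2.47x
```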
Note: For the Qualcomm Hexagon QHL CRL, explicit alignment specification for buffers is necessary only when the input or output variables are used directly in operators, functions, and System objects.