Specify Variable-Size Arguments for Code Generation

This example shows how to specify variable-size input arguments when you generate code for the object functions of classification and regression model objects. Variable-size data is data whose size might change at run time. Specifying variable-size input arguments is convenient when you have data with an unknown size at compile time. This example also describes how to include name-value pair arguments in an entry-point function and how to specify them when generating code.

For more detailed code generation workflow examples, see Code Generation for Prediction of Machine Learning Model at Command Line and Code Generation for Prediction of Machine Learning Model Using MATLAB Coder App.

Train Classification Model

Load Fisher's iris data set. Convert the labels to a character matrix.

load fisheriris
species = char(species);

Train a classification tree using the entire data set.

Mdl = fitctree(meas,species);

Mdl is a ClassificationTree model.

Save Model Using saveLearnerForCoder

Save the trained classification tree to a file named ClassTreeIris.mat in your current folder by using saveLearnerForCoder.

MdlName = 'ClassTreeIris';
saveLearnerForCoder(Mdl,MdlName);

Define Entry-Point Function

In your current folder, define an entry-point function named mypredictTree.m that does the following:

Accept measurements with columns corresponding to meas and accept valid name-value pair arguments.
Load a trained classification tree by using loadLearnerForCoder.
Predict labels and corresponding scores, node numbers, and class numbers from the loaded classification tree.

You can allow for optional name-value pair arguments by specifying varargin as an input argument. For details, see Code Generation for Variable Length Argument Lists (MATLAB Coder).

type mypredictTree.m  % Display contents of mypredictTree.m file

function [label,score,node,cnum] = mypredictTree(x,savedmdl,varargin) %#codegen
%MYPREDICTTREE Predict iris species using classification tree
%   MYPREDICTTREE predicts iris species for the n observations in the
%   n-by-4 matrix x using the classification tree stored in the MAT-file
%   whose name is in savedmdl, and then returns the predictions in the
%   array label. Each row of x contains the lengths and widths of the petal
%   and sepal of an iris (see the fisheriris data set). For other output
%   argument descriptions, see the predict reference page.
CompactMdl = loadLearnerForCoder(savedmdl);
[label,score,node,cnum] = predict(CompactMdl,x,varargin{:});
end
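
Before generating code, it can help to run the entry-point function directly in MATLAB and compare its output with predict. The following is a minimal sketch, assuming that Mdl and meas are still in the workspace and that ClassTreeIris.mat and mypredictTree.m are in your current folder. The variable names Xcheck, labelEP, labelRef, and sameLabels are illustrative only.

```matlab
% Run the entry-point function in MATLAB (no code generation yet).
Xcheck = meas(1:5,:);   % a few observations from the training data

% Pass the optional name-value pair through varargin.
labelEP = mypredictTree(Xcheck,'ClassTreeIris','Subtrees',1);

% Reference prediction from the original model with the same option.
labelRef = predict(Mdl,Xcheck,'Subtrees',1);

% Compare the two sets of labels.
sameLabels = isequal(labelEP,labelRef)
```

If the two results differ at this stage, the problem is likely in the entry-point function or the saved model rather than in the generated code.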

Generate Code

Specify Variable-Size Arguments

Because C and C++ are statically typed languages, you must determine the properties of all variables in an entry-point function at compile time using the -args option of codegen.

Use coder.Constant (MATLAB Coder) to specify a compile-time constant input.

coder.Constant(v)

coder.Constant(v) creates a coder.Constant type variable whose values are constant, the same as v, during code generation.
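
As a minimal sketch (assuming MATLAB Coder is installed), you can create a coder.Constant object and inspect the value it stores through its Value property. The variable name fileNameType is illustrative only.

```matlab
% Wrap a file name as a compile-time constant (requires MATLAB Coder).
fileNameType = coder.Constant('ClassTreeIris');

% The Value property holds the constant used during code generation.
disp(class(fileNameType))   % coder.Constant
disp(fileNameType.Value)    % ClassTreeIris
```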

Use coder.typeof (MATLAB Coder) to specify a variable-size input.

coder.typeof(example_value, size_vector, variable_dims)

The values of example_value, size_vector, and variable_dims specify the properties of the input array that the generated code can accept.

An input array has the same data type as the example values in example_value.
size_vector is the array size of an input array if the corresponding variable_dims value is false.
size_vector is the upper bound of the array size if the corresponding variable_dims value is true.
variable_dims specifies whether each dimension of the array has a variable size or a fixed size. A value of true (logical 1) means that the corresponding dimension has a variable size; a value of false (logical 0) means that the corresponding dimension has a fixed size.
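
The rules above can be checked by inspecting the type object that coder.typeof returns. The following is a small sketch, assuming MATLAB Coder is installed. The sizes and the variable name t are illustrative and are not part of this example's workflow.

```matlab
% Single-precision array: up to 10 rows (variable), exactly 3 columns (fixed).
t = coder.typeof(single(0),[10 3],[true false]);

disp(t.ClassName)      % data type taken from example_value
disp(t.SizeVector)     % upper bound for dimension 1, exact size for dimension 2
disp(t.VariableDims)   % which dimensions can change size at run time
```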

The entry-point function mypredictTree accepts predictor data, the MAT-file name containing the trained model object, and optional name-value pair arguments. Suppose that you want to generate code that accepts a variable-size array for predictor data and the 'Subtrees' name-value pair argument with a variable-size vector for its value. Then you have four input arguments: predictor data, the MAT-file name, and the name and value of the 'Subtrees' name-value pair argument.

Define a 4-by-1 cell array and assign each input argument type of the entry-point function to each cell.

ARGS = cell(4,1);

For the first input, use coder.typeof to specify that the predictor data variable is double-precision with the same number of columns as the predictor data used in training the model, but that the number of observations (rows) is arbitrary.

p = numel(Mdl.PredictorNames);
ARGS{1} = coder.typeof(0,[Inf,p],[1,0]);

0 for the example_value value implies that the data type is double because double is the default numeric data type of MATLAB. [Inf,p] for the size_vector value and [1,0] for the variable_dims value imply that the size of the first dimension is variable and unbounded, and the size of the second dimension is fixed to be p.

The second input is the MAT-file name, which must be a compile-time constant. Use coder.Constant to specify the type of the second input.

ARGS{2} = coder.Constant(MdlName);

The last two inputs are the name and value of the 'Subtrees' name-value pair argument. Names of name-value pair arguments must be compile-time constants.

ARGS{3} = coder.Constant('Subtrees');

Use coder.typeof to specify that the value of 'Subtrees' is a double-precision row vector and that the upper bound of the row vector size is max(Mdl.PruneList).

m = max(Mdl.PruneList);
ARGS{4} = coder.typeof(0,[1,m],[0,1]);

Again, 0 for the example_value value implies that the data type is double because double is the default numeric data type of MATLAB. [1,m] for the size_vector value and [0,1] for the variable_dims value imply that the size of the first dimension is fixed to be 1, and the size of the second dimension is variable and its upper bound is m.

Generate Code Using codegen

Generate a MEX function from the entry-point function mypredictTree using the cell array ARGS, which includes input argument types for mypredictTree. Specify the input argument types using the -args option. Specify the number of output arguments in the generated entry-point function using the -nargout option. The generated code includes the specified number of output arguments in the order in which they occur in the entry-point function definition.

codegen mypredictTree -args ARGS -nargout 2

Code generation successful.

codegen generates the MEX function mypredictTree_mex with a platform-dependent extension in your current folder.
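
Because the first dimension of the predictor data is variable-size, the same MEX function should accept inputs with different numbers of rows. The following sketch assumes that mypredictTree_mex was generated as above and that meas and MdlName are in the workspace. The names label5 and label40 are illustrative only.

```matlab
% Call the same MEX function with different numbers of observations.
label5  = mypredictTree_mex(meas(1:5,:),MdlName,'Subtrees',1);
label40 = mypredictTree_mex(meas(1:40,:),MdlName,'Subtrees',1);

% The constant inputs (MdlName and 'Subtrees') must match the values used
% at code generation time. Only the predictor data and the value of
% 'Subtrees' can change between calls.
size(label5,1)    % 5
size(label40,1)   % 40
```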

The predict function accepts single-precision values, double-precision values, and 'all' for the 'Subtrees' name-value pair argument. However, you can specify only double-precision values when you use the MEX function for prediction because the data type specified by ARGS{4} is double.

Verify Generated Code

Predict labels for a random selection of 15 values from the training data using the generated MEX function and the subtree at pruning level 1. Compare the labels from the MEX function with those predicted by predict.

rng('default'); % For reproducibility
Xnew = datasample(meas,15);
[labelMEX,scoreMEX] = mypredictTree_mex(Xnew,MdlName,'Subtrees',1);
[labelPREDICT,scorePREDICT] = predict(Mdl,Xnew,'Subtrees',1);
labelPREDICT

labelPREDICT = 15x10 char array
    'virginica '
    'virginica '
    'setosa    '
    'virginica '
    'versicolor'
    'setosa    '
    'setosa    '
    'versicolor'
    'virginica '
    'virginica '
    'setosa    '
    'virginica '
    'virginica '
    'versicolor'
    'virginica '

labelMEX

labelMEX = 15x1 cell
    {'virginica' }
    {'virginica' }
    {'setosa'    }
    {'virginica' }
    {'versicolor'}
    {'setosa'    }
    {'setosa'    }
    {'versicolor'}
    {'virginica' }
    {'virginica' }
    {'setosa'    }
    {'virginica' }
    {'virginica' }
    {'versicolor'}
    {'virginica' }

The predicted labels are the same as the MEX function labels except for the data type. When the response data type is char and codegen cannot determine that the value of Subtrees is a scalar, then the output from the generated code is a cell array of character vectors.

For the comparison, you can convert labelPREDICT to a cell array and use isequal.

cell_labelPREDICT = cellstr(labelPREDICT);
verifyLabel = isequal(labelMEX,cell_labelPREDICT)

verifyLabel = logical
   1

isequal returns logical 1 (true), which means all the inputs are equal.
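
The conversion matters because isequal treats a character array and a cell array as different, even when they hold the same text. The following self-contained sketch uses stand-in labels (hypothetical values, not the predictions from this example) to show the effect of cellstr.

```matlab
% Stand-in labels (hypothetical values, not the example's predictions).
charLabels = ['setosa    ';'versicolor';'virginica '];   % 3-by-10 char array
cellLabels = {'setosa';'versicolor';'virginica'};        % 3-by-1 cell array

isequal(charLabels,cellLabels)            % false: different data types
isequal(cellstr(charLabels),cellLabels)   % true: cellstr drops trailing blanks
```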

Compare the second outputs as well. Because scoreMEX might include round-off differences compared with scorePREDICT, compare the two outputs allowing a small tolerance.

find(abs(scorePREDICT-scoreMEX) > 1e-8)

ans =

  0x1 empty double column vector

find returns an empty vector if the element-wise absolute difference between scorePREDICT and scoreMEX is not larger than the specified tolerance 1e-8. The comparison confirms that scorePREDICT and scoreMEX are equal within the tolerance 1e-8.
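
An alternative way to express the same check is to reduce the element-wise differences to a single number and compare it with the tolerance. The following self-contained sketch uses stand-in score matrices (hypothetical values, not the scores from this example). The names scoreA, scoreB, tol, maxDiff, and withinTol are illustrative only.

```matlab
% Stand-in score matrices (hypothetical values) that differ by round-off.
scoreA = [0.9 0.1 0; 0 0.5 0.5];
scoreB = scoreA + 1e-12;

tol = 1e-8;
maxDiff = max(abs(scoreA(:) - scoreB(:)));   % largest element-wise gap
withinTol = maxDiff <= tol                    % true when every entry agrees within tol
```

Reporting the largest difference can be more informative than an empty index list, because it shows how close the two results are to the tolerance.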