Loss for cross-validated partitioned regression model
Find Cross-Validation Loss for Regression Ensemble
Find the cross-validation loss for a regression ensemble of the carsmall data.
Load the carsmall data set and select displacement, horsepower, and vehicle weight as predictors.
load carsmall
X = [Displacement Horsepower Weight];
Train an ensemble of regression trees.
rens = fitrensemble(X,MPG);
Create a cross-validated ensemble from rens and find the k-fold cross-validation loss.
rng(10,'twister') % For reproducibility
cvrens = crossval(rens);
L = kfoldLoss(cvrens)
L = 28.7114
Display Individual Losses for Each Cross-Validation Fold
The mean squared error (MSE) is a measure of model quality. Examine the MSE for each fold of a cross-validated regression model.
Load the carsmall data set. Specify the predictor data X and the response data Y.
load carsmall
X = [Cylinders Displacement Horsepower Weight];
Y = MPG;
Train a cross-validated regression tree model. By default, the software implements 10-fold cross-validation.
rng('default') % For reproducibility
CVMdl = fitrtree(X,Y,'CrossVal','on');
Compute the MSE for each fold. Visualize the distribution of the loss values by using a box plot. Notice that none of the values is an outlier.
losses = kfoldLoss(CVMdl,'Mode','individual')
losses = 10×1
   42.5072
   20.3995
   22.3737
   34.4255
   40.8005
   60.2755
   19.5562
    9.2060
   29.0788
   16.3386
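The box plot mentioned in this example can be drawn directly from the per-fold losses. The following is a minimal sketch, assuming a release that provides boxchart (R2020a or later). The variable anyOutliers and the isoutlier check with the 'quartiles' method are illustrative additions, not part of the original example.

```matlab
load carsmall
X = [Cylinders Displacement Horsepower Weight];
Y = MPG;

rng('default') % For reproducibility
CVMdl = fitrtree(X,Y,'CrossVal','on');

% One MSE value per fold (10 folds by default)
losses = kfoldLoss(CVMdl,'Mode','individual');

% Box plot of the per-fold losses
figure
boxchart(losses)
ylabel('Per-fold MSE')

% Illustrative check: flag values beyond 1.5 interquartile ranges
% from the quartiles (assumed here as the outlier rule)
anyOutliers = any(isoutlier(losses,'quartiles'));
```

If anyOutliers is false, no fold stands out under that rule, which is what the example text describes.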
Find Optimal Number of Trees for GAM Using kfoldLoss
Train a cross-validated generalized additive model (GAM) with 10 folds. Then, use
kfoldLoss to compute the cumulative cross-validation regression loss (mean squared errors). Use the errors to determine the optimal number of trees per predictor (linear term for predictor) and the optimal number of trees per interaction term.
Load the patients data set.
Create a table that contains the predictor variables (Age, Diastolic, Smoker, Weight, Gender, and SelfAssessedHealthStatus) and the response variable (Systolic).
tbl = table(Age,Diastolic,Smoker,Weight,Gender,SelfAssessedHealthStatus,Systolic);
Create a cross-validated GAM by using the default cross-validation option. Specify the 'CrossVal' name-value argument as 'on'. Also, specify to include 5 interaction terms.
rng('default') % For reproducibility
CVMdl = fitrgam(tbl,'Systolic','CrossVal','on','Interactions',5);
If you specify 'Mode' as 'cumulative' for kfoldLoss, then the function returns cumulative errors, which are the average errors across all folds obtained using the same number of trees for each fold. Display the number of trees for each fold.
CVMdl.NumTrainedPerFold
ans = struct with fields:
      PredictorTrees: [300 300 300 300 300 300 300 300 300 300]
    InteractionTrees: [76 100 100 100 100 42 100 100 59 100]
kfoldLoss can compute cumulative errors using up to 300 predictor trees and 42 interaction trees.
Plot the cumulative, 10-fold cross-validated, mean squared errors. Specify 'IncludeInteractions' as false to exclude interaction terms from the computation.
L_noInteractions = kfoldLoss(CVMdl,'Mode','cumulative','IncludeInteractions',false);
figure
plot(0:min(CVMdl.NumTrainedPerFold.PredictorTrees),L_noInteractions)
The first element of L_noInteractions is the average error over all folds obtained using only the intercept (constant) term. The (J+1)th element of L_noInteractions is the average error obtained using the intercept term and the first J predictor trees per linear term. Plotting the cumulative loss allows you to monitor how the error changes as the number of predictor trees in the GAM increases.
Find the minimum error and the number of predictor trees used to achieve the minimum error.
[M,I] = min(L_noInteractions)
M = 28.0506
I = 6
The GAM achieves the minimum error when it includes 5 predictor trees.
Compute the cumulative mean squared error using both linear terms and interaction terms.
L = kfoldLoss(CVMdl,'Mode','cumulative');
figure
plot(0:min(CVMdl.NumTrainedPerFold.InteractionTrees),L)
The first element of L is the average error over all folds obtained using the intercept (constant) term and all predictor trees per linear term. The (J+1)th element of L is the average error obtained using the intercept term, all predictor trees per linear term, and the first J interaction trees per interaction term. The plot shows that the error increases when interaction terms are added.
If you are satisfied with the error when the number of predictor trees is 5, you can create a predictive model by training the univariate GAM again and specifying
'NumTreesPerPredictor',5 without cross-validation.
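The retraining step described above might look like the following sketch. It rebuilds the table from the patients data set so that it runs on its own. The names Mdl and yfit are illustrative, and the predict call on the first five rows is only a quick sanity check, not part of the original example.

```matlab
load patients
tbl = table(Age,Diastolic,Smoker,Weight,Gender,SelfAssessedHealthStatus,Systolic);

% Univariate GAM (no interaction terms) with 5 trees per predictor,
% trained on all observations without cross-validation
Mdl = fitrgam(tbl,'Systolic','NumTreesPerPredictor',5);

% Predict the response for a few observations
yfit = predict(Mdl,tbl(1:5,:));
```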
CVMdl — Cross-validated partitioned regression model
RegressionPartitionedModel object | RegressionPartitionedEnsemble object | RegressionPartitionedGAM object | RegressionPartitionedGP object | RegressionPartitionedNeuralNetwork object | RegressionPartitionedSVM object
Cross-validated partitioned regression model, specified as a RegressionPartitionedModel, RegressionPartitionedEnsemble, RegressionPartitionedGAM, RegressionPartitionedGP, RegressionPartitionedNeuralNetwork, or RegressionPartitionedSVM object. You can create the object in two ways:
Pass a trained regression model listed in the following table to its crossval object function.
Train a regression model using a function listed in the following table and specify one of the cross-validation name-value arguments for the function.
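The two ways of creating the object can be sketched with a regression tree. This is an illustration under assumptions: the choice of fitrtree, the 'KFold' value of 5, and the variable names are not prescribed by this page.

```matlab
load carsmall
X = [Displacement Horsepower Weight];
rng('default') % For reproducibility

% Way 1: train a model, then pass it to its crossval object function
Mdl = fitrtree(X,MPG);
CVMdl1 = crossval(Mdl);             % 10-fold by default

% Way 2: specify a cross-validation name-value argument while training
CVMdl2 = fitrtree(X,MPG,'KFold',5); % 5-fold

% Either object can be passed to kfoldLoss
L1 = kfoldLoss(CVMdl1);
L2 = kfoldLoss(CVMdl2);
```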
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: kfoldLoss(CVMdl,'Folds',[1 2 3 5]) specifies to use the first, second, third, and fifth folds to compute the mean squared error, but to exclude the fourth fold.
Folds — Fold indices to use
1:CVMdl.KFold (default) | positive integer vector
Fold indices to use, specified as a positive integer vector. The elements of Folds must be within the range from 1 to CVMdl.KFold.
The software uses only the folds specified in Folds.
Example: 'Folds',[1 4 10]
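As a small illustration of the Folds argument, the sketch below computes the loss over a subset of folds and over all folds. The data set, model type, and variable names are assumptions chosen for brevity.

```matlab
load carsmall
X = [Displacement Horsepower Weight];
rng('default') % For reproducibility
CVMdl = fitrtree(X,MPG,'CrossVal','on'); % 10 folds by default

% Loss computed from folds 1, 4, and 10 only
Lsubset = kfoldLoss(CVMdl,'Folds',[1 4 10]);

% Loss computed from all folds (the default, 1:CVMdl.KFold)
Lall = kfoldLoss(CVMdl);
```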
IncludeInteractions — Flag to include interaction terms
Flag to include interaction terms of the model, specified as true or false. This argument is valid only for a generalized additive model (GAM). That is, you can specify this argument only when CVMdl is a RegressionPartitionedGAM object.
The default value is true if the models in CVMdl (CVMdl.Trained) contain interaction terms. The value must be false if the models do not contain interaction terms.
LossFun — Loss function
'mse' (default) | function handle
Loss function, specified as 'mse' or a function handle.
Specify the built-in function 'mse'. In this case, the loss function is the mean squared error.
Specify your own function using function handle notation.
Assume that n is the number of observations in the training data (CVMdl.NumObservations). Your function must have the signature lossvalue = lossfun(Y,Yfit,W), where:
The output argument lossvalue is a scalar.
You specify the function name (lossfun).
Y is an n-by-1 numeric vector of observed responses.
Yfit is an n-by-1 numeric vector of predicted responses.
W is an n-by-1 numeric vector of observation weights.
Specify your function using 'LossFun',@lossfun.
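A custom loss function only needs to accept the observed responses, the predicted responses, and the observation weights, and return a scalar. The sketch below uses a weighted mean absolute error as one possible choice. The handle name maeFun and the data set are illustrative assumptions.

```matlab
load carsmall
X = [Displacement Horsepower Weight];
rng('default') % For reproducibility
CVMdl = fitrtree(X,MPG,'CrossVal','on');

% Weighted mean absolute error: inputs are n-by-1 vectors, output is a scalar
maeFun = @(Y,Yfit,W) sum(W.*abs(Y - Yfit))/sum(W);

% Pass the handle through the LossFun name-value argument
Lmae = kfoldLoss(CVMdl,'LossFun',maeFun);
```

An anonymous function is used here for compactness. A named function in a file works the same way when passed as @functionName.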
Mode — Aggregation level for output
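'average' (default) | 'individual' | 'cumulative'
Aggregation level for the output, specified as 'average', 'individual', or 'cumulative'.
|Value||Description|
|'average'||The output is a scalar average over all folds.|
|'individual'||The output is a vector of length k containing one value per fold, where k is the number of folds.|
|'cumulative'||The output is a vector of cumulative losses. If you want to specify this value, CVMdl must be a RegressionPartitionedEnsemble object or RegressionPartitionedGAM object.|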
PredictionForMissingValue — Predicted response value to use for observations with missing predictor values
"median" | "mean" | "omitted" | numeric scalar
Predicted response value to use for observations with missing predictor values, specified as "median", "mean", "omitted", or a numeric scalar. This argument is valid only for a Gaussian process regression, neural network, or support vector machine model. That is, you can specify this argument only when CVMdl is a RegressionPartitionedGP, RegressionPartitionedNeuralNetwork, or RegressionPartitionedSVM object.
This value is
the default when
If an observation is missing all predictor values, an observed response value, or
an observation weight, then
kfoldLoss does not use the
observation in the loss computation.
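The following sketch shows how the PredictionForMissingValue argument might be passed for a cross-validated SVM regression model. It assumes R2023b or later, where this argument was introduced. The data set, the 'Standardize' setting, and the variable names are illustrative choices.

```matlab
load carsmall
X = [Displacement Horsepower Weight]; % Horsepower contains missing values
rng('default') % For reproducibility
CVMdl = fitrsvm(X,MPG,'Standardize',true,'CrossVal','on');

% Use the training-fold median response for observations with
% missing predictor values
Lmedian = kfoldLoss(CVMdl,'PredictionForMissingValue','median');

% Leave observations with missing predictor values out of the loss
Lomitted = kfoldLoss(CVMdl,'PredictionForMissingValue','omitted');
```

Whether the two values differ depends on how many validation-fold observations have missing predictor values.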
L — Loss
numeric scalar | numeric column vector
Loss, returned as a numeric scalar or numeric column vector.
By default, the loss is the mean squared error between the validation-fold observations and the predictions made with a regression model trained on the training-fold observations.
If Mode is 'average', then L is the average loss over all folds.
If Mode is 'individual', then L is a k-by-1 numeric column vector containing the loss for each fold, where k is the number of folds.
If Mode is 'cumulative' and CVMdl is RegressionPartitionedEnsemble, then L is a min(CVMdl.NumTrainedPerFold)-by-1 numeric column vector. Each element j is the average loss over all folds that the function obtains using ensembles trained with weak learners 1:j.
If Mode is 'cumulative' and CVMdl is RegressionPartitionedGAM, then the output value depends on the IncludeInteractions value.
If IncludeInteractions is false, then L is a (1 + min(NumTrainedPerFold.PredictorTrees))-by-1 numeric column vector. The first element of L is the average loss over all folds that is obtained using only the intercept (constant) term. The (j + 1)th element of L is the average loss obtained using the intercept term and the first j predictor trees per linear term.
If IncludeInteractions is true, then L is a (1 + min(NumTrainedPerFold.InteractionTrees))-by-1 numeric column vector. The first element of L is the average loss over all folds that is obtained using the intercept (constant) term and all predictor trees per linear term. The (j + 1)th element of L is the average loss obtained using the intercept term, all predictor trees per linear term, and the first j interaction trees per interaction term.
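To make the default loss concrete, the sketch below recomputes the MSE of one fold by hand from the trained fold model and the validation-fold observations, then compares it with the corresponding 'individual' output. It assumes equal observation weights and removes rows with missing values up front so that the partition indices line up with X and Y. The variable names are illustrative.

```matlab
load carsmall
X = [Displacement Horsepower Weight];
Y = MPG;

% Drop rows with missing values so indices match the partition
keep = ~any(isnan([X Y]),2);
X = X(keep,:);
Y = Y(keep);

rng('default') % For reproducibility
CVMdl = fitrtree(X,Y,'CrossVal','on');

k = 1;                                   % fold to inspect
isValidation = test(CVMdl.Partition,k);  % validation-fold observations
Yfit = predict(CVMdl.Trained{k},X(isValidation,:));
manualMSE = mean((Y(isValidation) - Yfit).^2);

% Per-fold losses reported by kfoldLoss
losses = kfoldLoss(CVMdl,'Mode','individual');
difference = abs(manualMSE - losses(k));
```

With equal weights, difference is expected to be zero up to floating-point rounding.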
If you want to compute the cross-validated loss of a tree model, you can avoid constructing a RegressionPartitionedModel object by calling cvloss. Creating a cross-validated tree object can save you time if you plan to examine it more than once.
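The two routes for a tree model can be compared in a short sketch. The data set and variable names are assumptions. Because each call draws its own random partition, the two loss values are not expected to be identical.

```matlab
load carsmall
X = [Displacement Horsepower Weight];
rng('default') % For reproducibility
tree = fitrtree(X,MPG);

% Route 1: cross-validated MSE directly from the tree,
% without creating a partitioned model object
E = cvloss(tree);

% Route 2: create the partitioned model once and reuse it
CVMdl = crossval(tree);
Lavg = kfoldLoss(CVMdl);                          % average over folds
Lfold = kfoldLoss(CVMdl,'Mode','individual');     % one value per fold
```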
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
Usage notes and limitations:
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
Version History
Introduced in R2011a
R2023b: Specify predicted response value to use for observations with missing predictor values
Starting in R2023b, when you predict or compute the loss, some regression models allow you to specify the predicted response value for observations with missing predictor values. Specify the
PredictionForMissingValue name-value argument to use a numeric scalar, the training set median, or the training set mean as the predicted value. When computing the loss, you can also specify to omit observations with missing predictor values.
This table lists the object functions that support the
PredictionForMissingValue name-value argument. By default, the
functions use the training set median as the predicted response value for observations with
missing predictor values.
|Model Type||Model Objects||Object Functions|
|Gaussian process regression (GPR) model|
|Gaussian kernel regression model|
|Linear regression model|
|Neural network regression model|
|Support vector machine (SVM) regression model|
In previous releases, the regression model
predict functions listed above used
NaN predicted response values for observations with missing predictor values. The software omitted observations with missing predictor values from the resubstitution ("resub") and cross-validation ("kfold") computations for prediction and loss.
R2023a: GPU support for
Starting in R2023a,
kfoldLoss fully supports GPU arrays for