
loss

Regression loss for Gaussian kernel regression model

Description


L = loss(Mdl,X,Y) returns the mean squared error (MSE) for the Gaussian kernel regression model Mdl using the predictor data in X and the corresponding responses in Y.

L = loss(Mdl,Tbl,ResponseVarName) returns the MSE for the model Mdl using the predictor data in Tbl and the true responses in Tbl.ResponseVarName.

L = loss(Mdl,Tbl,Y) returns the MSE for the model Mdl using the predictor data in table Tbl and the true responses in Y.


L = loss(___,Name,Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. For example, you can specify a regression loss function and observation weights. Then, loss returns the weighted regression loss using the specified loss function.

Examples


Train a Gaussian kernel regression model for a tall array, then calculate the resubstitution mean squared error and epsilon-insensitive error.

When you perform calculations on tall arrays, MATLAB® uses either a parallel pool (default if you have Parallel Computing Toolbox™) or the local MATLAB session. To run the example using the local MATLAB session when you have Parallel Computing Toolbox, change the global execution environment by using the mapreducer function.

mapreducer(0)

Create a datastore that references the folder location with the data. The data can be contained in a single file, a collection of files, or an entire folder. Treat 'NA' values as missing data so that datastore replaces them with NaN values. Select a subset of the variables to use. Create a tall table on top of the datastore.

varnames = {'ArrTime','DepTime','ActualElapsedTime'};
ds = datastore('airlinesmall.csv','TreatAsMissing','NA',...
    'SelectedVariableNames',varnames);
t = tall(ds);

Specify DepTime and ArrTime as the predictor variables (X) and ActualElapsedTime as the response variable (Y). Select the observations for which ArrTime is later than DepTime.

daytime = t.ArrTime>t.DepTime;
Y = t.ActualElapsedTime(daytime);     % Response data
X = t{daytime,{'DepTime' 'ArrTime'}}; % Predictor data

Standardize the predictor variables.

Z = zscore(X); % Standardize the data

Train a default Gaussian kernel regression model with the standardized predictors. Set 'Verbose',0 to suppress diagnostic messages.

[Mdl,FitInfo] = fitrkernel(Z,Y,'Verbose',0)
Mdl = 
  RegressionKernel
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 64
               KernelScale: 1
                    Lambda: 8.5385e-06
             BoxConstraint: 1
                   Epsilon: 5.9303


FitInfo = struct with fields:
                  Solver: 'LBFGS-tall'
            LossFunction: 'epsiloninsensitive'
                  Lambda: 8.5385e-06
           BetaTolerance: 1.0000e-03
       GradientTolerance: 1.0000e-05
          ObjectiveValue: 26.1409
       GradientMagnitude: 0.0023
    RelativeChangeInBeta: 0.0150
                 FitTime: 32.5816
                 History: []

Mdl is a trained RegressionKernel model, and the structure array FitInfo contains optimization details.

Assess the fit of the trained model to the training data by estimating the resubstitution mean squared error and epsilon-insensitive error.

lossMSE = loss(Mdl,Z,Y) % Resubstitution mean squared error
lossMSE =

  MxNx... tall array

    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :
lossEI = loss(Mdl,Z,Y,'LossFun','epsiloninsensitive') % Resubstitution epsilon-insensitive error
lossEI =

  MxNx... tall array

    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :

Evaluate the tall arrays and bring the results into memory by using gather.

[lossMSE,lossEI] = gather(lossMSE,lossEI)
Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 1: Completed in 1.7 sec
Evaluation completed in 2 sec
lossMSE = 2.5141e+03
lossEI = 25.5148

Specify a custom regression loss (Huber loss) for a Gaussian kernel regression model.

Load the carbig data set.

load carbig

Specify the predictor variables (X) and the response variable (Y).

X = [Weight,Cylinders,Horsepower,Model_Year];
Y = MPG;

Delete rows of X and Y where either array has NaN values. Removing rows with NaN values before passing data to fitrkernel can speed up training and reduce memory usage.

R = rmmissing([X Y]); 
X = R(:,1:4); 
Y = R(:,end); 

Reserve 10% of the observations as a holdout sample. Extract the training and test indices from the partition definition.

rng(10)  % For reproducibility
N = length(Y);
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp); % Training set indices
idxTest = test(cvp);    % Test set indices

Train the regression kernel model. Standardize the training data.

Xtrain = X(idxTrn,:);
Ytrain = Y(idxTrn);
Mdl = fitrkernel(Xtrain,Ytrain,'Standardize',true)
Mdl = 
  RegressionKernel
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 128
               KernelScale: 1
                    Lambda: 0.0028
             BoxConstraint: 1
                   Epsilon: 0.8617


Mdl is a RegressionKernel model.

Create an anonymous function that measures Huber loss (δ=1), that is,

$$L = \frac{1}{\sum_{j=1}^{n} w_j} \sum_{j=1}^{n} w_j \ell_j,$$

where

$$\ell_j = \begin{cases} 0.5\,\hat{e}_j^{\,2}, & |\hat{e}_j| \le 1 \\ |\hat{e}_j| - 0.5, & |\hat{e}_j| > 1. \end{cases}$$

$\hat{e}_j$ is the residual for observation j. Custom loss functions must be written in a particular form. For rules on writing a custom loss function, see the 'LossFun' name-value argument.

huberloss = @(Y,Yhat,W)sum(W.*((0.5*(abs(Y-Yhat)<=1).*(Y-Yhat).^2) + ...
    ((abs(Y-Yhat)>1).*(abs(Y-Yhat)-0.5))))/sum(W);

Estimate the training set regression loss using the Huber loss function.

eTrain = loss(Mdl,Xtrain,Ytrain,'LossFun',huberloss)
eTrain = 1.7210

Estimate the test set regression loss using the Huber loss function.

Xtest = X(idxTest,:);
Ytest = Y(idxTest);

eTest = loss(Mdl,Xtest,Ytest,'LossFun',huberloss)
eTest = 1.3062

Input Arguments


Mdl

Kernel regression model, specified as a RegressionKernel model object. You can create a RegressionKernel model object using fitrkernel.

X

Predictor data, specified as an n-by-p numeric matrix, where n is the number of observations and p is the number of predictors. p must be equal to the number of predictors used to train Mdl.

Data Types: single | double

Y

Response data, specified as an n-dimensional numeric vector. The length of Y must be equal to the number of observations in X or Tbl.

Data Types: single | double

Tbl

Sample data used to train the model, specified as a table. Each row of Tbl corresponds to one observation, and each column corresponds to one predictor variable. Optionally, Tbl can contain additional columns for the response variable and observation weights. Tbl must contain all the predictors used to train Mdl. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

If Tbl contains the response variable used to train Mdl, then you do not need to specify ResponseVarName or Y.

If you train Mdl using sample data contained in a table, then the input data for loss must also be in a table.
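For instance, assuming Mdl was trained on a table and using a hypothetical table Tbl whose response variable is named MPG, either table syntax computes the same loss:

L = loss(Mdl,Tbl,'MPG');    % response specified by variable name
L = loss(Mdl,Tbl,Tbl.MPG);  % response specified as a numeric vector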

ResponseVarName

Response variable name, specified as the name of a variable in Tbl. The response variable must be a numeric vector. If Tbl contains the response variable used to train Mdl, then you do not need to specify ResponseVarName.

If you specify ResponseVarName, then you must specify it as a character vector or string scalar. For example, if the response variable is stored as Tbl.Y, then specify ResponseVarName as 'Y'. Otherwise, the software treats all columns of Tbl, including Tbl.Y, as predictors.

Data Types: char | string

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: L = loss(Mdl,X,Y,'LossFun','epsiloninsensitive','Weights',weights) returns the weighted regression loss using the epsilon-insensitive loss function.

LossFun

Loss function, specified as the comma-separated pair consisting of 'LossFun' and a built-in loss function name or a function handle.

  • The following table lists the available loss functions. Specify one using its corresponding character vector or string scalar. Also, in the table, $f(x) = T(x)\beta + b$.

    • x is an observation (row vector) from p predictor variables.

    • T(·) is a transformation of an observation (row vector) for feature expansion. T(x) maps x in $\mathbb{R}^p$ to a high-dimensional space ($\mathbb{R}^m$).

    • β is a vector of m coefficients.

    • b is the scalar bias.

    Value                   Description
    'epsiloninsensitive'    Epsilon-insensitive loss: $\ell[y,f(x)] = \max[0,\, |y - f(x)| - \varepsilon]$
    'mse'                   MSE: $\ell[y,f(x)] = [y - f(x)]^2$

    'epsiloninsensitive' is appropriate for SVM learners only.

  • Specify your own function by using function handle notation.

    Let n be the number of observations in X. Your function must have this signature:

    lossvalue = lossfun(Y,Yhat,W)

    • The output argument lossvalue is a scalar.

    • You choose the function name (lossfun).

    • Y is an n-dimensional vector of observed responses. loss passes in the responses you supply (Y or Tbl.ResponseVarName) for this argument.

    • Yhat is an n-dimensional vector of predicted responses, which is similar to the output of predict.

    • W is an n-by-1 numeric vector of observation weights.

    Specify your function using 'LossFun',@lossfun.
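For instance, this minimal sketch defines a weighted mean absolute error and passes it to loss (the handle name maeloss is hypothetical; it assumes a trained model Mdl and in-memory data X and Y):

maeloss = @(Y,Yhat,W)sum(W.*abs(Y-Yhat))/sum(W); % weighted mean absolute error
L = loss(Mdl,X,Y,'LossFun',maeloss);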

Data Types: char | string | function_handle

PredictionForMissingValue

Since R2023b

Predicted response value to use for observations with missing predictor values, specified as "median", "mean", "omitted", or a numeric scalar.

Value             Description
"median"          loss uses the median of the observed response values in the training data as the predicted response value for observations with missing predictor values.
"mean"            loss uses the mean of the observed response values in the training data as the predicted response value for observations with missing predictor values.
"omitted"         loss excludes observations with missing predictor values from the loss computation.
Numeric scalar    loss uses this value as the predicted response value for observations with missing predictor values.

If an observation is missing an observed response value or an observation weight, then loss does not use the observation in the loss computation.

Example: "PredictionForMissingValue","omitted"
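A minimal sketch, assuming a trained model Mdl and in-memory data X and Y (the NaN position is arbitrary):

Xmiss = X;
Xmiss(1,1) = NaN;  % introduce a missing predictor value
L = loss(Mdl,Xmiss,Y,"PredictionForMissingValue","omitted"); % skip that observation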

Data Types: single | double | char | string

Weights

Observation weights, specified as the comma-separated pair consisting of 'Weights' and a numeric vector or the name of a variable in Tbl.

  • If Weights is a numeric vector, then the size of Weights must be equal to the number of rows in X or Tbl.

  • If Weights is the name of a variable in Tbl, you must specify Weights as a character vector or string scalar. For example, if the weights are stored as Tbl.W, then specify Weights as 'W'. Otherwise, the software treats all columns of Tbl, including Tbl.W, as predictors.

If you supply the observation weights, loss computes the weighted regression loss, that is, the Weighted Mean Squared Error or Epsilon-Insensitive Loss Function.

loss normalizes Weights to sum to 1.
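For example, this sketch weights later observations more heavily (it assumes a trained model Mdl and in-memory data X and Y; the weighting scheme is arbitrary):

w = linspace(1,2,numel(Y))';   % later observations get larger weights
L = loss(Mdl,X,Y,'Weights',w); % loss normalizes w to sum to 1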

Data Types: double | single | char | string

Output Arguments


L

Regression loss, returned as a numeric scalar. The interpretation of L depends on Weights and LossFun. For example, if you use the default observation weights and specify 'epsiloninsensitive' as the loss function, then L is the epsilon-insensitive loss.

More About


Weighted Mean Squared Error

The weighted mean squared error is calculated as follows:

$$\text{mse} = \frac{\sum_{j=1}^{n} w_j \left( f(x_j) - y_j \right)^2}{\sum_{j=1}^{n} w_j},$$

where:

  • n is the number of observations.

  • xj is the jth observation (row of predictor data).

  • yj is the observed response to xj.

  • f(xj) is the response prediction of the Gaussian kernel regression model Mdl to xj.

  • w is the vector of observation weights.

Each observation weight in w is equal to ones(n,1)/n by default. You can specify different values for the observation weights by using the 'Weights' name-value pair argument. loss normalizes Weights to sum to 1.
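You can reproduce the default MSE computation with predict. A minimal sketch, assuming a trained model Mdl and in-memory data X and Y with no missing values:

yhat = predict(Mdl,X);           % predicted responses f(x_j)
mseManual = mean((yhat - Y).^2); % default weights are all 1/n
mseLoss = loss(Mdl,X,Y);         % should match mseManual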

Epsilon-Insensitive Loss Function

The epsilon-insensitive loss function ignores errors that are within the distance epsilon (ε) of the function value. The function is formally described as:

$$\text{Loss}_\varepsilon = \begin{cases} 0, & \text{if } |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon, & \text{otherwise.} \end{cases}$$

The mean epsilon-insensitive loss is calculated as follows:

$$\text{Loss} = \frac{\sum_{j=1}^{n} w_j \max\left( 0, |y_j - f(x_j)| - \varepsilon \right)}{\sum_{j=1}^{n} w_j},$$

where:

  • n is the number of observations.

  • xj is the jth observation (row of predictor data).

  • yj is the observed response to xj.

  • f(xj) is the response prediction of the Gaussian kernel regression model Mdl to xj.

  • w is the vector of observation weights.

Each observation weight in w is equal to ones(n,1)/n by default. You can specify different values for the observation weights by using the 'Weights' name-value pair argument. loss normalizes Weights to sum to 1.
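Similarly, you can reproduce the epsilon-insensitive loss from the epsilon stored in the trained model. A minimal sketch, assuming a trained model Mdl and in-memory data X and Y with no missing values:

yhat = predict(Mdl,X);
epsVal = Mdl.Epsilon;                                  % epsilon used during training
lossManual = mean(max(0,abs(Y - yhat) - epsVal));      % default weights are all 1/n
lossEI = loss(Mdl,X,Y,'LossFun','epsiloninsensitive'); % should match lossManual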

Extended Capabilities

Version History

Introduced in R2018a
