regularize

Find optimal weights for learners in regression ensemble

Description

ens1 = regularize(ens) finds optimal weights for learners in ens using lasso regularization. regularize returns a RegressionEnsemble or RegressionBaggedEnsemble model identical to ens, but with a populated Regularization property.

ens1 = regularize(ens,Name=Value) specifies additional options using one or more name-value arguments. For example, you can specify the regularization parameter values, relative tolerance on the regularization level, and maximum number of lasso optimization passes.

Examples

Regularize an ensemble of bagged trees.

Generate sample data.

rng(10,"twister") % For reproducibility
X = rand(2000,20);
Y = repmat(-1,2000,1);
Y(sum(X(:,1:5),2)>2.5) = 1;

You can create a bagged regression ensemble of 300 trees from the sample data.

bag = fitrensemble(X,Y,Method="Bag",NumLearningCycles=300);

fitrensemble uses a default template tree object templateTree() as a weak learner when Method is "Bag". In this example, for reproducibility, specify Reproducible=true when you create a tree template object, and then use the object as a weak learner.

t = templateTree(Reproducible=true); % For reproducibility of random predictor selections
bag = fitrensemble(X,Y,Method="Bag",NumLearningCycles=300,Learners=t);

Regularize the ensemble of bagged regression trees.

bag = regularize(bag,Lambda=[0.001 0.1],Verbose=1);
Starting lasso regularization for Lambda=0.001. Initial MSE=0.109923.
    Lasso regularization completed pass 1 for Lambda=0.001
        MSE = 0.086912
        Relative change in MSE = 0.264768
        Number of learners with nonzero weights = 15
    Lasso regularization completed pass 2 for Lambda=0.001
        MSE = 0.0670602
        Relative change in MSE = 0.296029
        Number of learners with nonzero weights = 34
    Lasso regularization completed pass 3 for Lambda=0.001
        MSE = 0.0623931
        Relative change in MSE = 0.0748019
        Number of learners with nonzero weights = 51
    Lasso regularization completed pass 4 for Lambda=0.001
        MSE = 0.0605444
        Relative change in MSE = 0.0305348
        Number of learners with nonzero weights = 70
    Lasso regularization completed pass 5 for Lambda=0.001
        MSE = 0.0599666
        Relative change in MSE = 0.00963517
        Number of learners with nonzero weights = 94
    Lasso regularization completed pass 6 for Lambda=0.001
        MSE = 0.0598835
        Relative change in MSE = 0.00138719
        Number of learners with nonzero weights = 105
    Lasso regularization completed pass 7 for Lambda=0.001
        MSE = 0.0598608
        Relative change in MSE = 0.000379227
        Number of learners with nonzero weights = 113
    Lasso regularization completed pass 8 for Lambda=0.001
        MSE = 0.0598586
        Relative change in MSE = 3.72856e-05
        Number of learners with nonzero weights = 115
    Lasso regularization completed pass 9 for Lambda=0.001
        MSE = 0.0598587
        Relative change in MSE = 6.42954e-07
        Number of learners with nonzero weights = 115
    Lasso regularization completed pass 10 for Lambda=0.001
        MSE = 0.0598587
        Relative change in MSE = 4.53658e-08
        Number of learners with nonzero weights = 115
    Completed lasso minimization for Lambda=0.001.
    Resubstitution MSE changed from 0.109923 to 0.0598587.
    Number of learners reduced from 300 to 115.
Starting lasso regularization for Lambda=0.1. Initial MSE=0.109923.
    Lasso regularization completed pass 1 for Lambda=0.1
        MSE = 0.104917
        Relative change in MSE = 0.0477191
        Number of learners with nonzero weights = 12
    Lasso regularization completed pass 2 for Lambda=0.1
        MSE = 0.0851031
        Relative change in MSE = 0.232821
        Number of learners with nonzero weights = 30
    Lasso regularization completed pass 3 for Lambda=0.1
        MSE = 0.081245
        Relative change in MSE = 0.0474877
        Number of learners with nonzero weights = 40
    Lasso regularization completed pass 4 for Lambda=0.1
        MSE = 0.0796749
        Relative change in MSE = 0.0197067
        Number of learners with nonzero weights = 53
    Lasso regularization completed pass 5 for Lambda=0.1
        MSE = 0.0788411
        Relative change in MSE = 0.0105746
        Number of learners with nonzero weights = 64
    Lasso regularization completed pass 6 for Lambda=0.1
        MSE = 0.0784959
        Relative change in MSE = 0.00439793
        Number of learners with nonzero weights = 81
    Lasso regularization completed pass 7 for Lambda=0.1
        MSE = 0.0784429
        Relative change in MSE = 0.000676468
        Number of learners with nonzero weights = 88
    Lasso regularization completed pass 8 for Lambda=0.1
        MSE = 0.078447
        Relative change in MSE = 5.24449e-05
        Number of learners with nonzero weights = 88
    Completed lasso minimization for Lambda=0.1.
    Resubstitution MSE changed from 0.109923 to 0.078447.
    Number of learners reduced from 300 to 88.

Because Verbose is 1, regularize reports its progress after each pass.

Inspect the resulting regularization structure.

bag.Regularization
ans = struct with fields:
               Method: 'Lasso'
       TrainedWeights: [300x2 double]
               Lambda: [1.0000e-03 0.1000]
    ResubstitutionMSE: [0.0599 0.0784]
       CombineWeights: @classreg.learning.combiner.WeightedSum

Check how many learners in the regularized ensemble have positive weights. These are the learners included in a shrunken ensemble.

sum(bag.Regularization.TrainedWeights > 0)
ans = 1×2

   115    88

Shrink the ensemble using the weights from Lambda = 0.1.

cmp = shrink(bag,weightcolumn=2)
cmp = 
  CompactRegressionEnsemble
             ResponseName: 'Y'
    CategoricalPredictors: []
        ResponseTransform: 'none'
               NumTrained: 88


The compact ensemble contains 88 members, fewer than one-third of the original 300.
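As a follow-up to the shrinking step, the sketch below compares the resubstitution MSE of a full regularized ensemble with that of its shrunken version. It is a minimal, self-contained illustration rather than part of the original example: the data sizes, the number of trees, and the variable names (ens, numKept, mseFull, mseCompact) are assumptions chosen to keep the run short, and it requires Statistics and Machine Learning Toolbox.

```matlab
rng(1,"twister")                          % For reproducibility
X = rand(500,20);                         % Smaller hypothetical data set
Y = repmat(-1,500,1);
Y(sum(X(:,1:5),2)>2.5) = 1;

% Train and regularize a small bagged regression ensemble
t = templateTree(Reproducible=true);
ens = fitrensemble(X,Y,Method="Bag",NumLearningCycles=50,Learners=t);
ens = regularize(ens,Lambda=[0.001 0.1]);

% Number of learners with positive weights for each Lambda value
numKept = sum(ens.Regularization.TrainedWeights > 0);

% Shrink using the weights in the second Lambda column
cmp = shrink(ens,weightcolumn=2);

% Compare the resubstitution MSE of the full and compact ensembles
mseFull = loss(ens,X,Y);
mseCompact = loss(cmp,X,Y);
```

If mseCompact is close to mseFull, the shrunken ensemble trades little accuracy for a smaller model. Resubstitution error is optimistic, so a held-out set or cross-validation gives a more reliable comparison.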

Input Arguments

ens

Regression ensemble model, specified as a RegressionEnsemble or RegressionBaggedEnsemble model object trained with fitrensemble.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: regularize(ens,MaxIter=100,Npass=5) allows a maximum of 100 iterations to reach the convergence tolerance and a maximum of 5 passes for lasso optimization.
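The two name-value syntaxes described above can be compared side by side. This is a minimal sketch on a small hypothetical data set (the data, ensemble size, and the names ensA and ensB are assumptions for illustration); it requires Statistics and Machine Learning Toolbox.

```matlab
rng(1,"twister")                          % For reproducibility
X = rand(200,5);                          % Hypothetical predictors
Y = X(:,1) + 0.1*randn(200,1);            % Hypothetical response
ens = fitrensemble(X,Y,"Method","Bag","NumLearningCycles",20);

% R2021a and later: Name=Value syntax
ensA = regularize(ens,MaxIter=100,Npass=5);

% Before R2021a: comma-separated pairs with quoted names
ensB = regularize(ens,"MaxIter",100,"Npass",5);
```

Both calls pass the same options, so the Lambda values stored in ensA.Regularization and ensB.Regularization should match.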

Lambda

Regularization parameter values for lasso, specified as a vector of nonnegative scalar values. For the default setting of Lambda, regularize calculates the smallest value Lambda_max for which all optimal weights for learners are 0. The default value of Lambda is a vector including 0 and nine exponentially spaced numbers from Lambda_max/1000 to Lambda_max.

Example: Lambda=[0 0.001 0.01 0.1]

Data Types: single | double
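To make the shape of the default Lambda vector concrete, the following sketch builds a grid with the same structure: 0 followed by nine exponentially spaced values from Lambda_max/1000 to Lambda_max. The value of lambdaMax here is hypothetical, because regularize computes Lambda_max internally from the ensemble, and using logspace is an assumption about how to produce the spacing, not a statement about the internal implementation.

```matlab
% Hypothetical Lambda_max; regularize derives the real value from the ensemble
lambdaMax = 2;

% 0 plus nine exponentially spaced values from lambdaMax/1000 to lambdaMax
lambdaGrid = [0 logspace(log10(lambdaMax/1000),log10(lambdaMax),9)];
```

A vector like lambdaGrid can be passed as the Lambda argument when you want a custom range with the same layout as the default.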

MaxIter

Maximum number of iterations allowed, specified as a positive integer. If the algorithm executes MaxIter iterations before reaching the convergence tolerance, then the function stops iterating and issues a warning. The function can issue more than one warning when either Npass or the number of Lambda values is greater than 1.

Example: MaxIter=100

Data Types: single | double

Npass

Maximum number of passes for lasso optimization, specified as a positive integer.

Example: Npass=5

Data Types: single | double

Reltol

Relative tolerance on the regularized loss for lasso, specified as a positive numeric scalar.

Example: Reltol=1e-4

Data Types: single | double

Verbose

Verbosity level, specified as 0 or 1. When this argument is set to 1, regularize displays more information during the regularization process.

Example: Verbose=1

Data Types: single | double

More About

Lasso

The lasso algorithm finds an optimal set of learner weights $\alpha_t$ that minimize

$$\sum_{n=1}^{N} w_n \, g\left(\left(\sum_{t=1}^{T} \alpha_t h_t(x_n)\right), y_n\right) + \lambda \sum_{t=1}^{T} |\alpha_t|.$$

Here

  • $\lambda \ge 0$ is a parameter you provide, called the lasso parameter.

  • $h_t$ is a weak learner in the ensemble trained on $N$ observations with predictors $x_n$, responses $y_n$, and weights $w_n$.

  • $g(f,y) = (f - y)^2$ is the squared error.
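The lasso objective defined above can be evaluated directly for a given set of learner weights. The sketch below uses toy numbers only; the names H, y, w, alpha, and lambda are hypothetical, and the observation weights are assumed to be normalized to sum to 1. It evaluates the objective once and does not perform the minimization that regularize carries out.

```matlab
% Hypothetical toy values: 3 observations, 2 weak learners
H = [1 2; 3 4; 5 6];      % H(n,t) = prediction of learner t for observation n
y = [1; 2; 3];            % Responses y_n
w = [1; 1; 1]/3;          % Observation weights w_n (assumed to sum to 1)
alpha = [0.5; 0.25];      % Candidate learner weights alpha_t
lambda = 0.1;             % Lasso parameter

f = H*alpha;              % Weighted ensemble prediction for each observation
sqErr = (f - y).^2;       % Squared error g(f,y) = (f - y)^2

% Weighted squared error plus the L1 penalty on the learner weights
objective = sum(w.*sqErr) + lambda*sum(abs(alpha));
```

For these values, the weighted squared error is 1.25/3 and the penalty is 0.075. Increasing lambda raises the cost of every nonzero weight, which is why larger Lambda values leave fewer learners with nonzero weights.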

Version History

Introduced in R2011a