featureSelectionRegressionNCAComponent

Pipeline component for performing feature selection using neighborhood component analysis (NCA) for regression

Since R2026a

Description

featureSelectionRegressionNCAComponent is a pipeline component that performs feature selection using neighborhood component analysis (NCA) for regression. The pipeline component uses the functionality of the fsrnca function during the learn phase to identify important predictors in the data. During the run phase, the component selects the same predictors from a new data set.
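As a rough illustration of the learn phase, the sketch below calls fsrnca directly and ranks predictors by feature weight. It is not the component's implementation: the missing-value handling with rmmissing and the rule of keeping the five largest weights are assumptions made for this example, so the ranking need not match what the component selects.

```matlab
% Sketch only: rank predictors with fsrnca, the function the component
% uses during the learn phase. Preprocessing here is an assumption.
load carbig
names = ["Acceleration" "Cylinders" "Displacement" ...
    "Horsepower" "Model_Year" "Weight"];
data = rmmissing([Acceleration Cylinders Displacement ...
    Horsepower Model_Year Weight MPG]);   % drop rows with missing values
Xmat = data(:,1:6);                       % numeric predictor matrix
y = data(:,7);                            % response (MPG)

mdl = fsrnca(Xmat,y);                     % fit the NCA regression model
[~,order] = sort(mdl.FeatureWeights,"descend");
top5 = names(order(1:5))                  % five predictors with the largest weights
```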

Creation

Syntax

component = featureSelectionRegressionNCAComponent

component = featureSelectionRegressionNCAComponent(Name=Value)

Description

component = featureSelectionRegressionNCAComponent creates a pipeline component for feature selection using an NCA feature selection model. Use the component when creating a pipeline for regression.

component = featureSelectionRegressionNCAComponent(Name=Value) sets writable properties using one or more name-value arguments. For example, you can specify the regularization parameter, solver type, and method used for model fitting.


Properties


Structural Parameters

The software sets structural parameters when you create the component. You cannot modify structural parameters after creating the component.

`UseWeights` — Observation weights flag
`false` or `0` (default) | `true` or `1`

This property is read-only after the component is created.

Observation weights flag, specified as 0 (false) or 1 (true). If UseWeights is true, the component adds a third input "Weights" to the Inputs component property, and a third input tag 3 to the InputTags component property.

Example: c = featureSelectionRegressionNCAComponent(UseWeights=1)

Data Types: logical
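A short check of the behavior described above, assuming the component is available in your release. The expected values in the comments restate the property description and are not captured output.

```matlab
% Create the component with observation weights enabled.
c = featureSelectionRegressionNCAComponent(UseWeights=true);

% Per the description above, a third input port and a third
% input tag (3) are expected.
inputs = c.Inputs
tags = c.InputTags
```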

Learn Parameters

The software sets learn parameters when you create the component. You can modify learn parameters using dot notation any time before you use the learn object function. Any unset learn parameters use the corresponding default values.
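For instance, the following sketch sets one learn parameter at creation and two more with dot notation before any call to learn. The parameter values are arbitrary and chosen only for illustration.

```matlab
% Set a learn parameter at creation time.
c = featureSelectionRegressionNCAComponent(Solver="sgd");

% Adjust further learn parameters with dot notation before calling learn.
c.PassLimit = 3;
c.Lambda = 0.01;
```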

`Epsilon` — Epsilon value
nonnegative real scalar

Epsilon value, specified as a nonnegative scalar.

This property is valid only when LossFunction is "epsiloninsensitive".

The default value is iqr(Y)/13.49, where Y is the second data argument of learn. This value is derived from the interquartile range of the response variable: iqr(Y)/1.349 estimates the sample standard deviation for normally distributed data, so the default is about one-tenth of that estimate.

Example: c = featureSelectionRegressionNCAComponent(Epsilon=0.1)

Example: c.Epsilon = 0.05

Data Types: single | double
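The default can be reproduced by hand. This sketch uses a synthetic numeric response (a hypothetical stand-in for the data passed to learn) and compares the default with the usual IQR-based standard deviation estimate.

```matlab
rng(1)                        % reproducible synthetic data
Y = 5 + 2*randn(1000,1);      % hypothetical response, standard deviation 2

epsDefault = iqr(Y)/13.49     % default Epsilon formula stated above
sigmaHat = iqr(Y)/1.349;      % IQR-based estimate of the standard deviation
ratio = epsDefault/sigmaHat   % the default is one-tenth of sigmaHat
```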

`FitMethod` — Method for fitting the model
`"exact"` (default) | `"none"` | `"average"`

Method for fitting the model, specified as one of the following:

"exact" — Performs fitting using all of the data
"none" — No fitting. Use this option to evaluate the generalization error of the NCA model using the initial feature weights.
"average" — Divides the data into partitions (subsets), fits each partition using the exact method, and returns the average of the feature weights.

Example: c = featureSelectionRegressionNCAComponent(FitMethod="none")

Example: c.FitMethod = "average"

Data Types: char | string
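The idea behind the "average" method can be sketched manually with fsrnca: split the observations into subsets, fit each subset exactly, and average the resulting feature weights. The synthetic data, the number of partitions, and the random assignment scheme are assumptions for this illustration and may differ from what the software does internally.

```matlab
rng(0)                                   % reproducible synthetic data
n = 200; p = 4;
X = randn(n,p);
y = 2*X(:,1) - X(:,3) + 0.1*randn(n,1);  % only predictors 1 and 3 matter

numParts = 4;                            % hypothetical number of partitions
part = mod(randperm(n),numParts) + 1;    % random partition label per row

W = zeros(p,numParts);
for k = 1:numParts
    idx = part == k;
    mdl = fsrnca(X(idx,:),y(idx),FitMethod="exact");  % exact fit per partition
    W(:,k) = mdl.FeatureWeights;
end
avgWeights = mean(W,2)                   % average weights across partitions
```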

`GradientTolerance` — Relative convergence tolerance
`1e-6` (default) | positive real scalar

Relative convergence tolerance on the gradient norm, specified as a positive real scalar.

This property is valid only when Solver is "lbfgs".

Example: c = featureSelectionRegressionNCAComponent(GradientTolerance=2e-6)

Example: c.GradientTolerance = 1e-5

Data Types: single | double

`HessianHistorySize` — Size of history buffer for Hessian approximation
`15` (default) | positive integer

Size of the history buffer for Hessian approximation, specified as a positive integer. At each iteration, the component uses the most recent HessianHistorySize iterations to build an approximation of the inverse Hessian.

This property is valid only when Solver is "lbfgs".

Example: c = featureSelectionRegressionNCAComponent(HessianHistorySize=20)

Example: c.HessianHistorySize = 10

Data Types: single | double

`InitialLearningRate` — Initial learning rate
`"auto"` (default) | positive real scalar

Initial learning rate, specified as a positive real scalar or "auto".

When Solver is "sgd", the learning rate decays over iterations starting with the value specified for InitialLearningRate.

When you specify "auto", the initial learning rate is determined using experiments on small subsets of data. Use the NumTuningIterations property to specify the number of iterations for automatically tuning the initial learning rate. Use the TuningSubsetSize property to specify the number of observations to use for automatically tuning the initial learning rate.

For solver type "minibatch-lbfgs", you can set InitialLearningRate to a very high value. In this case, the function applies LBFGS to each mini-batch separately with initial feature weights from the previous mini-batch.

Example: c = featureSelectionRegressionNCAComponent(InitialLearningRate=0.9)

Example: c.InitialLearningRate = "auto"

Data Types: single | double | char | string

`InitialStepSize` — Initial step size
`"auto"` (default) | positive real scalar

Initial step size, specified as a positive real scalar or "auto".

This property is valid only when Solver is "lbfgs".

Example: c = featureSelectionRegressionNCAComponent(InitialStepSize=0.1)

Example: c.InitialStepSize = "auto"

Data Types: single | double | char | string

`IterationLimit` — Maximum number of iterations
positive integer

Maximum number of iterations, specified as a positive integer.

Each pass through a batch is an iteration. Each pass through all of the data is an epoch. If the data is divided into k mini-batches, then every epoch is equivalent to k iterations.

If Solver is "sgd", the default value is 10000. If Solver is "lbfgs" or "minibatch-lbfgs", the default value is 1000.

Example: c = featureSelectionRegressionNCAComponent(IterationLimit=250)

Example: c.IterationLimit = 1000

Data Types: single | double
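The relationship between iterations and epochs is simple arithmetic. The sketch below uses a hypothetical observation count together with the MiniBatchSize and PassLimit defaults stated on this page. It assumes that a final, smaller mini-batch still counts as one iteration.

```matlab
n = 392;                         % hypothetical number of observations
miniBatchSize = min(10,n);       % MiniBatchSize default
k = ceil(n/miniBatchSize);       % mini-batches, and so iterations, per epoch
passLimit = 5;                   % PassLimit default
totalIterations = passLimit*k;   % iterations needed to complete all passes

fprintf("%d iterations per epoch, %d iterations in %d passes\n", ...
    k,totalIterations,passLimit)
```

With these numbers, five passes take 200 iterations, well below the "sgd" IterationLimit default of 10000.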

`Lambda` — Regularization parameter
nonnegative scalar

Regularization parameter, specified as a nonnegative scalar.

As the number of observations increases, the chance of overfitting decreases and the required amount of regularization also decreases.

The default value is 1/n, where n is the number of observations in the first data argument of learn.

Example: c = featureSelectionRegressionNCAComponent(Lambda=0.002)

Example: c.Lambda = 0.01

Data Types: single | double
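To see how the default shrinks as the data grows, the following sketch evaluates 1/n for a few hypothetical sample sizes.

```matlab
n = [100 1000 10000];     % hypothetical numbers of observations
lambdaDefault = 1./n      % default Lambda for each sample size
```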

`LengthScale` — Width of kernel
`1` (default) | positive real scalar

Width of kernel, specified as a positive real scalar.

A length scale value of 1 is sensible when all predictors are on the same scale. If the predictors are of very different magnitudes, then consider standardizing the predictor values using the Standardize property.

Example: c = featureSelectionRegressionNCAComponent(LengthScale=1.5)

Example: c.LengthScale = 1.25

Data Types: single | double

`LineSearchMethod` — Line search method
`"weakwolfe"` (default) | `"strongwolfe"` | `"backtracking"`

Line search method, specified as one of the following:

"weakwolfe" — Weak Wolfe line search
"strongwolfe" — Strong Wolfe line search
"backtracking" — Backtracking line search

This property is valid only when Solver is "lbfgs".

Example: c = featureSelectionRegressionNCAComponent(LineSearchMethod="strongwolfe")

Example: c.LineSearchMethod = "backtracking"

Data Types: char | string

`LossFunction` — Loss function
`"mad"` (default) | `"mse"` | `"epsiloninsensitive"` | function handle

Loss function, specified as one of the following:

"mad" — Mean absolute deviation
"mse" — Mean squared error
"epsiloninsensitive" — ε-insensitive loss function
Function handle — Custom loss function handle. The function must have the form L = lossfun(Yu,Yv), where Yu is a u-by-1 vector, Yv is a v-by-1 vector, and L is a u-by-v matrix of loss values.

Example: c = featureSelectionRegressionNCAComponent(LossFunction="mse")

Example: c.LossFunction = "epsiloninsensitive"

Data Types: char | string | function_handle
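A custom loss handle must return a u-by-v matrix of pairwise losses. The sketch below defines a hypothetical capped absolute loss (the cap of 1.5 is arbitrary) and checks its output shape on small vectors before passing it to the component.

```matlab
% Hypothetical custom loss: absolute deviation capped at 1.5.
% Yu is u-by-1 and Yv is v-by-1, so Yu - Yv.' expands to u-by-v.
lossfun = @(Yu,Yv) min(abs(Yu - Yv.'),1.5);

Yu = [1; 2; 4];
Yv = [1; 3];
L = lossfun(Yu,Yv)      % 3-by-2 matrix of loss values

c = featureSelectionRegressionNCAComponent(LossFunction=lossfun);
```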

`MaxLineSearchIterations` — Maximum number of line search iterations
`20` (default) | positive integer

Maximum number of line search iterations, specified as a positive integer.

This property is valid only when Solver is "lbfgs".

Example: c = featureSelectionRegressionNCAComponent(MaxLineSearchIterations=25)

Example: c.MaxLineSearchIterations = 15

Data Types: single | double

`MaxWeightFraction` — Max weight fraction for selecting features
numeric scalar in the range (0,1]

Max weight fraction for selecting features, specified as a numeric scalar in the range (0,1].

If you do not specify the NumFeatures or MaxWeightFraction value, the software selects all features. You cannot specify both NumFeatures and MaxWeightFraction.

Example: c = featureSelectionRegressionNCAComponent(MaxWeightFraction=0.5)

Example: c.MaxWeightFraction = 0.75

Data Types: single | double

`MiniBatchLBFGSIterations` — Maximum number of iterations per mini-batch LBFGS step
`10` (default) | positive integer

Maximum number of iterations per mini-batch LBFGS step, specified as a positive integer.

This property is valid only when Solver is "minibatch-lbfgs".

Example: c = featureSelectionRegressionNCAComponent(MiniBatchLBFGSIterations=15)

Example: c.MiniBatchLBFGSIterations = 20

Data Types: single | double

`MiniBatchSize` — Number of observations to use in each batch
positive integer

Number of observations to use in each batch, specified as a positive integer between 1 and n, where n is the number of observations in the first data argument of learn.

This property is valid only when Solver is "sgd".

The default value is min(10,n).

Example: c = featureSelectionRegressionNCAComponent(MiniBatchSize=25)

Example: c.MiniBatchSize = 20

Data Types: single | double

`NumFeatures` — Number of features to select
positive integer scalar

Number of features (predictors) to select, specified as a positive integer scalar.

If you do not specify the NumFeatures or MaxWeightFraction value, the software selects all features. You cannot specify both NumFeatures and MaxWeightFraction.

Example: c = featureSelectionRegressionNCAComponent(NumFeatures=5)

Example: c.NumFeatures = 10

Data Types: single | double

`NumTuningIterations` — Number of tuning iterations
`20` (default) | positive integer

Number of tuning iterations, specified as a positive integer.

This property is valid only when Solver is "sgd" and InitialLearningRate is "auto".

Example: c = featureSelectionRegressionNCAComponent(NumTuningIterations=15)

Example: c.NumTuningIterations = 25

Data Types: single | double

`PassLimit` — Maximum number of passes
`5` (default) | positive integer

Maximum number of passes, specified as a positive integer. Each pass through all of the data is called an epoch.

This property is valid only when Solver is "sgd".

Example: c = featureSelectionRegressionNCAComponent(PassLimit=10)

Example: c.PassLimit = 3

Data Types: single | double

`Solver` — Solver type
`"lbfgs"` | `"sgd"` | `"minibatch-lbfgs"`

Solver type for estimating feature weights, specified as one of the following:

"lbfgs" — Limited memory Broyden–Fletcher–Goldfarb–Shanno (LBFGS) algorithm
"sgd" — Stochastic gradient descent (SGD) algorithm
"minibatch-lbfgs" — Stochastic gradient descent with LBFGS algorithm applied to mini-batches

The default value is "sgd" when n>1000, where n is the number of observations in the first data argument of learn. Otherwise, the default value is "lbfgs".

Example: c = featureSelectionRegressionNCAComponent(Solver="sgd")

Example: c.Solver = "lbfgs"

Data Types: char | string
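The default rules for Solver and IterationLimit stated on this page can be written out as plain logic. This is a restatement for illustration with a hypothetical observation count, not the software's source code.

```matlab
n = 1500;                 % hypothetical number of observations

% Solver default: "sgd" for more than 1000 observations, else "lbfgs".
if n > 1000
    solver = "sgd";
else
    solver = "lbfgs";
end

% IterationLimit default depends on the solver.
if solver == "sgd"
    iterLimit = 10000;
else
    iterLimit = 1000;     % "lbfgs" or "minibatch-lbfgs"
end
```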

`Standardize` — Indicator for standardizing predictor data
`false` or `0` (default) | `true` or `1`

Indicator for standardizing predictor data, specified as 0 (false) or 1 (true).

Example: c = featureSelectionRegressionNCAComponent(Standardize=true)

Example: c.Standardize = false

Data Types: logical
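The following sketch shows what standardization typically means for predictors on very different scales. It assumes that the component centers and scales each predictor column by its mean and standard deviation, as fsrnca does. The data is hypothetical.

```matlab
X = [1 1000; 2 3000; 3 2000];    % two predictors with very different magnitudes
Xs = (X - mean(X))./std(X)       % center and scale each column
```

After this transformation each column has mean 0 and standard deviation 1, so a single LengthScale value applies equally to both predictors.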

`StepTolerance` — Convergence tolerance on step size
`1e-6` (default) | positive real scalar

Convergence tolerance on the step size, specified as a positive real scalar.

This property is valid only when Solver is "sgd" or "lbfgs".

The "lbfgs" solver uses an absolute step tolerance, and the "sgd" solver uses a relative step tolerance.

Example: c = featureSelectionRegressionNCAComponent(StepTolerance=5e-6)

Example: c.StepTolerance = 1e-5

Data Types: single | double

`TuningSubsetSize` — Number of observations to use for tuning initial learning rate
positive integer value

Number of observations to use for tuning the initial learning rate, specified as a positive integer value from 1 to n, where n is the number of observations in the first data argument of learn.

This property is valid only when Solver is "sgd" and InitialLearningRate is "auto".

The default value is min(100,n).

Example: c = featureSelectionRegressionNCAComponent(TuningSubsetSize=25)

Example: c.TuningSubsetSize = 50

Data Types: single | double

Component Properties

The software sets component properties when you create the component. You can modify the component properties (excluding HasLearnables and HasLearned) using dot notation at any time. You cannot modify the HasLearnables and HasLearned properties directly.

`Name` — Component identifier
`"FeatureSelectionRegressionNCA"` (default) | character vector | string scalar

Component identifier, specified as a character vector or string scalar.

Example: c = featureSelectionRegressionNCAComponent(Name="FeatureSelector")

Example: c.Name = "NCASelector"

Data Types: char | string

`Inputs` — Names of input ports
`["X","Y"]` (default) | character vector | string array | cell array of character vectors

Names of the input ports, specified as a character vector, string array, or cell array of character vectors. If UseWeights is true, the component adds the input port "W" to Inputs.

Example: c = featureSelectionRegressionNCAComponent(Inputs=["Data1","Data2"])

Example: c.Inputs = ["X1","Y1"]

Data Types: char | string | cell

`Outputs` — Names of output ports
`["XSelected","Scores"]` (default) | character vector | string array | cell array of character vectors

Names of the output ports, specified as a character vector, string array, or cell array of character vectors.

Example: c = featureSelectionRegressionNCAComponent(Outputs=["newX","importance"])

Example: c.Outputs = ["X","S"]

Data Types: char | string | cell

`InputTags` — Tags that enable automatic connection of component inputs
`[1 2]` (default) | nonnegative integer vector

Tags that enable the automatic connection of the component inputs with other components or pipelines, specified as a nonnegative integer vector. If you specify InputTags, then the number of tags must match the number of inputs in Inputs. If UseWeights is true, the software adds a third input tag to InputTags.

Example: c = featureSelectionRegressionNCAComponent(InputTags=[1 0])

Example: c.InputTags = [1 2]

Data Types: single | double

`OutputTags` — Tags that enable automatic connection of component outputs
`[1 NaN]` (default) | nonnegative integer vector

Tags that enable the automatic connection of the component outputs with other components or pipelines, specified as a nonnegative integer vector. If you specify OutputTags, then the number of tags must match the number of outputs in Outputs.

Example: c = featureSelectionRegressionNCAComponent(OutputTags=[1 0])

Example: c.OutputTags = [1 2]

Data Types: single | double

`HasLearnables` — Indicator for learnables
Read-only: `1` (`true`) (default)

This property is read-only.

Indicator for the learnables, returned as 1 (true). A value of 1 indicates that the component contains Learnables.

Data Types: logical

`HasLearned` — Indicator showing learning status of component
Read-only: `0` (`false`) (default) | `1` (`true`)

This property is read-only.

Indicator showing the learning status of the component, returned as 0 (false) or 1 (true). A value of 1 indicates that the learn object function has been applied to the component and the Learnables are nonempty.

Data Types: logical

Learnables

The software sets learnables when you use the learn object function. You cannot modify learnables directly.

`Model` — Neighborhood component analysis model for regression
Read-only: `FeatureSelectionNCARegression` object

This property is read-only.

Neighborhood component analysis model for regression, returned as a FeatureSelectionNCARegression model object.

`SelectedVariables` — Names of features selected by component
Read-only: string array | `[]`

This property is read-only.

Names of the features selected by the component, returned as a string array. The features correspond to columns in the first data argument of learn.

Data Types: string

`UsedVariables` — Names of variables used by component
Read-only: string array | `[]`

This property is read-only.

Names of the variables used by the component to select features, returned as a string array. The variables correspond to columns in the first data argument of learn.

Data Types: string

Object Functions

`learn`	Initialize and evaluate pipeline or component
`run`	Execute pipeline or component for inference after learning
`reset`	Reset pipeline or component
`series`	Connect components in series to create pipeline
`parallel`	Connect components or pipelines in parallel to create pipeline
`view`	View diagram of pipeline inputs, outputs, components, and connections

Examples


Create and Learn Pipeline Component for NCA Feature Selection

Create a featureSelectionRegressionNCAComponent pipeline component. Specify to select 5 features.

component = featureSelectionRegressionNCAComponent(NumFeatures=5)

component = 
  featureSelectionRegressionNCAComponent with properties:

                 Name: "FeatureSelectionRegressionNCA"
               Inputs: ["X"    "Y"]
            InputTags: [1 2]
              Outputs: ["XSelected"    "Scores"]
           OutputTags: [1 NaN]

   
Learnables (HasLearned = false)
                Model: []
    SelectedVariables: []
        UsedVariables: []

   
Structural Parameters (locked)
           UseWeights: 0

   
Learn Parameters (unlocked)
          NumFeatures: 5


Show all parameters

component is a featureSelectionRegressionNCAComponent object that contains three learnables: Model, SelectedVariables, and UsedVariables. These properties remain empty until you pass data to the component during the learn phase.

Load the carbig data set. Store the predictor and response data in the tables X and Y, respectively.

load carbig
X = table(Acceleration,Cylinders,Displacement, ...
    Horsepower,Model_Year,Weight,Origin);
Y = table(MPG);

Use the learn object function to select features from the predictor data X.

component = learn(component,X,Y)

component = 
  featureSelectionRegressionNCAComponent with properties:

                 Name: "FeatureSelectionRegressionNCA"
               Inputs: ["X"    "Y"]
            InputTags: [1 2]
              Outputs: ["XSelected"    "Scores"]
           OutputTags: [1 NaN]

   
Learnables (HasLearned = true)
                Model: [1×1 FeatureSelectionNCARegression]
    SelectedVariables: ["Model_Year"    "Acceleration"    "Horsepower"    "Displacement"    "Cylinders"]
        UsedVariables: ["Acceleration"    "Cylinders"    "Displacement"    "Horsepower"    "Model_Year"    "Weight"]

   
Structural Parameters (locked)
           UseWeights: 0

   
Learn Parameters (locked)
          NumFeatures: 5


Show all parameters

Note that the HasLearned property is set to true and Model, SelectedVariables, and UsedVariables are nonempty.
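Conceptually, the run phase keeps the same predictors in new data. The sketch below imitates that step with ordinary table indexing. It does not call the run object function. The variable names are copied from the SelectedVariables display above, and the new data values are made up.

```matlab
% Names copied from the SelectedVariables display above.
selected = ["Model_Year" "Acceleration" "Horsepower" ...
    "Displacement" "Cylinders"];

% Hypothetical new observations with the same predictor names.
Xnew = table([12;15],[4;8],[100;300],[90;150],[76;80],[2500;4000], ...
    VariableNames=["Acceleration" "Cylinders" "Displacement" ...
    "Horsepower" "Model_Year" "Weight"]);

Xsel = Xnew(:,selected)     % keep only the selected predictors, in order
```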

Find the names of the selected features.

names = component.SelectedVariables

names = 

  1×5 string array

    "Model_Year"    "Acceleration"    "Horsepower"    "Displacement"    "Cylinders"

Version History

Introduced in R2026a
