Feature selection for regression using neighborhood component analysis (NCA)
FeatureSelectionNCARegression contains the data, fitting
information, feature weights, and other model parameters of a neighborhood
component analysis (NCA) model.
fsrnca learns the feature
weights using a diagonal adaptation of NCA and returns a
FeatureSelectionNCARegression object. The
function achieves feature selection by regularizing the feature weights.
Create a FeatureSelectionNCARegression object using fsrnca.
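As a rough illustration of feature selection through regularization, the sketch below fits two models that differ only in the fsrnca 'Lambda' name-value pair argument and prints their weights side by side. The synthetic data and the two Lambda values are assumptions chosen for illustration. A larger Lambda generally pushes more weights toward zero, but the exact values depend on the data.

```matlab
% Hypothetical synthetic data: only predictors 1 and 3 drive the response.
rng(1);                                   % for reproducibility
n = 200;
X = rand(n,6);
y = 4*X(:,1) - 3*X(:,3) + 0.1*randn(n,1);

% Fit with a weaker and a stronger regularization strength.
% The values 0.1/n and 5/n are arbitrary illustrative choices.
mdlWeak   = fsrnca(X,y,'Lambda',0.1/n);
mdlStrong = fsrnca(X,y,'Lambda',5/n);

% Column 1: weights with weak regularization.
% Column 2: weights with strong regularization.
disp([mdlWeak.FeatureWeights, mdlStrong.FeatureWeights])
```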
FitMethod— Name of the fitting method used to fit this model
Name of the fitting method used to fit this model, stored as one of the following:
'exact' — Perform
fitting using all of the data.
'none' — No
fitting. Use this option to evaluate the
generalization error of the NCA model using the
initial feature weights supplied in the call to fsrnca.
'average' — The
software divides the data into partitions
(subsets), fits each partition using the
exact method, and returns the
average of the feature weights. You can specify
the number of partitions using the
NumPartitions name-value pair argument.
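The three fitting methods can be compared with a sketch like the following. The data, the train/test split, and the all-ones initial weights are assumptions made for illustration. 'InitialFeatureWeights' is the fsrnca name-value pair argument assumed here for supplying the starting weights.

```matlab
% Hypothetical synthetic data with a simple train/test split.
rng(2);
n = 300; p = 5;
X = rand(n,p);
y = 2*X(:,2) + X(:,4).^2 + 0.05*randn(n,1);
Xtrain = X(1:200,:);    ytrain = y(1:200);
Xtest  = X(201:end,:);  ytest  = y(201:end);

% 'exact': fit using all of the training data.
mdlExact = fsrnca(Xtrain,ytrain,'FitMethod','exact');

% 'average': fit each of 3 partitions and keep the per-partition weights.
mdlAvg = fsrnca(Xtrain,ytrain,'FitMethod','average','NumPartitions',3);

% 'none': no fitting; evaluate the supplied initial weights on held-out data.
w0 = ones(p,1);         % illustrative starting weights
mdlNone = fsrnca(Xtrain,ytrain,'FitMethod','none','InitialFeatureWeights',w0);
baselineLoss = loss(mdlNone,Xtest,ytest);
exactLoss    = loss(mdlExact,Xtest,ytest);

% Documented as p-by-1 for 'exact' and p-by-m for 'average'.
disp(size(mdlExact.FeatureWeights))
disp(size(mdlAvg.FeatureWeights))
```

Comparing baselineLoss with exactLoss gives a sense of how much the learned weights improve on the initial ones for this particular data set.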
InitialLearningRate— Initial learning rate
Initial learning rate for the 'sgd' solver. The
learning rate decays over iterations starting at the
value specified for InitialLearningRate. Use
TuningSubsetSize to control
the automatic tuning of the initial learning rate in the
call to fsrnca.
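A minimal sketch of the two ways to set the initial learning rate, assuming the stochastic gradient descent solver is selected with 'Solver','sgd' in the call to fsrnca. The data, the subset size of 100, and the fixed rate of 0.05 are illustrative assumptions, not recommended settings.

```matlab
% Hypothetical synthetic data.
rng(3);
n = 500;
X = rand(n,4);
y = X(:,1) + 0.5*X(:,2) + 0.1*randn(n,1);

% Let fsrnca tune the initial learning rate on a subset of 100 observations.
mdlAuto = fsrnca(X,y,'Solver','sgd', ...
    'InitialLearningRate','auto','TuningSubsetSize',100);

% Supply the initial learning rate directly (0.05 is an arbitrary value).
mdlFixed = fsrnca(X,y,'Solver','sgd','InitialLearningRate',0.05);

% Inspect what each model stored for the property.
disp(mdlAuto.InitialLearningRate)
disp(mdlFixed.InitialLearningRate)
```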
FeatureWeights— Feature weights
Feature weights, stored as a p-by-1 vector of real scalar values, where p is the number of predictors in the training data.
If 'FitMethod' is equal to 'average', then FeatureWeights is a p-by-m matrix, where m is the number of partitions specified via the 'NumPartitions' name-value pair argument in the call to fsrnca.
The absolute value of FeatureWeights(k) is a measure of the importance of predictor k. If FeatureWeights(k) is close to 0, then this indicates that predictor k does not influence the response.
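One way to act on this interpretation is to rank the predictors by absolute weight and keep those above a relative threshold. The sketch below uses a made-up weight vector in place of mdl.FeatureWeights, and the tolerance of 0.02 is an assumption for illustration rather than a documented default.

```matlab
% Hypothetical weight vector standing in for mdl.FeatureWeights.
w = [1.8; 0.0; 0.4; 1e-6; 2.3];

% Rank predictors from most to least important by absolute weight.
[~,rankedIdx] = sort(abs(w),'descend');   % rankedIdx is [5;1;3;4;2]

% Keep predictors whose weight exceeds a relative threshold.
tol = 0.02;                               % illustrative tolerance
threshold = tol*max(1,max(abs(w)));       % 0.046 for this vector
selected = find(abs(w) > threshold);      % selected is [1;3;5]

disp(rankedIdx')
disp(selected')
```

Scaling the tolerance by the largest weight keeps the cutoff meaningful when all weights are large, while the max(1,...) guard avoids an overly small cutoff when all weights are small.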
| Function | Description |
|---|---|
| loss | Evaluate accuracy of learned feature weights on test data |
| predict | Predict responses using neighborhood component analysis (NCA) regression model |
| refit | Refit neighborhood component analysis (NCA) model for regression |
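The following sketch shows how the three object functions might be combined. The synthetic data, the index-based split, and the refit value of Lambda are assumptions made for illustration.

```matlab
% Hypothetical synthetic data with a simple index-based split.
rng(4);
n = 300;
X = rand(n,5);
y = 3*X(:,1) - 2*X(:,5) + 0.1*randn(n,1);
idxTrain = 1:200;
idxTest  = 201:n;

% Fit on the training portion.
mdl = fsrnca(X(idxTrain,:),y(idxTrain));

% loss: mean squared error of the learned weights on the test portion.
testLoss = loss(mdl,X(idxTest,:),y(idxTest),'LossFunction','mse');

% predict: predicted responses for the test observations.
yhat = predict(mdl,X(idxTest,:));

% refit: refit the same model with a different regularization strength
% (2/numel(idxTrain) is an arbitrary illustrative value).
mdlRefit = refit(mdl,'Lambda',2/numel(idxTrain));

disp(testLoss)
```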
Load the sample data.
The first 15 columns contain the continuous predictor variables, whereas the 16th column contains the response variable, which is the price of a car. Define the variables for the neighborhood component analysis model.
Predictors = X(:,1:15); Y = X(:,16);
Fit a neighborhood component analysis (NCA) model for regression to detect the relevant features.
mdl = fsrnca(Predictors,Y);
The returned NCA model, mdl, is a FeatureSelectionNCARegression object. This object stores information about the training data, model, and optimization. You can access the object properties, such as the feature weights, using dot notation.
Plot the feature weights.
figure()
plot(mdl.FeatureWeights,'ro')
xlabel('Feature Index')
ylabel('Feature Weight')
grid on
The weights of the irrelevant features are zero. To display the optimization information on the command line, specify the 'Verbose',1 option in the call to fsrnca. You can also visualize the optimization process by plotting the objective function versus the iteration number.
figure()
plot(mdl.FitInfo.Iteration,mdl.FitInfo.Objective,'ro-')
grid on
xlabel('Iteration Number')
ylabel('Objective')
The ModelParameters property is a struct that contains more information about the model. You can access the fields of this property using dot notation. For example, check whether the data was standardized.
mdl.ModelParameters.Standardize
ans = logical 0
A value of 0 means that the data was not standardized before fitting the NCA model. When the predictors are on very different scales, you can standardize them using the 'Standardize',1 name-value pair argument in the call to fsrnca.
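As a sketch of when standardization matters, the example below builds predictors on very different scales and fits with 'Standardize' turned on. The data are made up, and the ModelParameters field name Standardize is assumed to match the name-value pair argument.

```matlab
% Hypothetical predictors on very different scales.
rng(5);
n = 200;
X = [rand(n,1), 1000*rand(n,1), 0.01*rand(n,1)];
y = 2*X(:,1) + 0.001*X(:,2) + 0.1*randn(n,1);

% Standardize the predictors before fitting.
mdlStd = fsrnca(X,y,'Standardize',true);

% Check the stored setting (field name assumed to be Standardize).
isStandardized = mdlStd.ModelParameters.Standardize;
disp(isStandardized)
```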
Value. To learn how value classes affect copy operations, see Copying Objects.