compareHoldout
Compare accuracies of two classification models using new data
Syntax

h = compareHoldout(C1,C2,T1,T2,ResponseVarName)
h = compareHoldout(C1,C2,T1,T2,Y)
h = compareHoldout(C1,C2,X1,X2,Y)
h = compareHoldout(___,Name,Value)
Description
compareHoldout
statistically assesses the accuracies of
two classification models. The function first compares their predicted labels against
the true labels, and then it detects whether the difference between the
misclassification rates is statistically significant.
You can determine whether the accuracies of the classification models differ or
whether one model performs better than another. compareHoldout
can
conduct several McNemar test variations,
including the asymptotic test, the exact-conditional test, and the
mid-p-value test. For cost-sensitive assessment, available tests include a chi-square test
(requires Optimization Toolbox™) and a likelihood ratio test.
h = compareHoldout(C1,C2,T1,T2,ResponseVarName) returns the test decision from testing the null hypothesis that the trained classification models C1 and C2 have equal accuracy for predicting the true class labels in the ResponseVarName variable. The alternative hypothesis is that the labels have unequal accuracy.

The first classification model C1 uses the predictor data in T1, and the second classification model C2 uses the predictor data in T2. The tables T1 and T2 must contain the same response variable but can contain different sets of predictors. By default, the software conducts the mid-p-value McNemar test to compare the accuracies.
h = 1 indicates rejecting the null hypothesis at the 5% significance level. h = 0 indicates not rejecting the null hypothesis at the 5% level.
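For example, the following sketch compares a simple model with a more complex one on the same holdout table. The table Tbl and its response variable 'Y' are hypothetical stand-ins for your own data.

% Hypothetical data: Tbl is a table of predictors with a response variable 'Y'.
rng(1)                                          % for reproducibility
cvp = cvpartition(height(Tbl),'HoldOut',0.3);   % 70% training, 30% holdout
TblTrain = Tbl(training(cvp),:);
TblTest  = Tbl(test(cvp),:);

C1 = fitctree(TblTrain,'Y');                    % simpler model: a single classification tree
C2 = fitcensemble(TblTrain,'Y','Method','Bag'); % more complex model: bagged trees

% Mid-p-value McNemar test (the default) of equal accuracy on the holdout set
h = compareHoldout(C1,C2,TblTest,TblTest,'Y')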
The following are examples of tests you can conduct:

- Compare the accuracies of a simple classification model and a model that is more complex by passing the same set of predictor data (that is, T1 = T2).
- Compare the accuracies of two potentially different models using two potentially different sets of predictors.
- Perform various types of feature selection. For example, you can compare the accuracy of a model trained using a set of predictors to the accuracy of one trained on a subset or different set of those predictors. You can choose the set of predictors arbitrarily, or use a feature selection technique such as PCA or sequential feature selection (see pca and sequentialfs).
h = compareHoldout(C1,C2,T1,T2,Y) returns the test decision from testing the null hypothesis that the trained classification models C1 and C2 have equal accuracy for predicting the true class labels Y. The alternative hypothesis is that the labels have unequal accuracy.

The first classification model C1 uses the predictor data T1, and the second classification model C2 uses the predictor data T2. By default, the software conducts the mid-p-value McNemar test to compare the accuracies.
h = compareHoldout(C1,C2,X1,X2,Y) returns the test decision from testing the null hypothesis that the trained classification models C1 and C2 have equal accuracy for predicting the true class labels Y. The alternative hypothesis is that the labels have unequal accuracy.

The first classification model C1 uses the predictor data X1, and the second classification model C2 uses the predictor data X2. By default, the software conducts the mid-p-value McNemar test to compare the accuracies.
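For example, the following sketch uses the matrix syntax to compare a model trained on the full predictor matrix with one trained on the leading principal component scores (see pca). The variables X and Y are hypothetical stand-ins for your own numeric predictor matrix and class labels.

% Hypothetical data: X is a numeric predictor matrix, Y is a vector of class labels.
rng(1)
cvp = cvpartition(Y,'HoldOut',0.3);            % stratified 70/30 split
XTrain = X(training(cvp),:);   YTrain = Y(training(cvp));
XTest  = X(test(cvp),:);       YTest  = Y(test(cvp));

% Project onto the first two principal components of the training data,
% then apply the same centered transformation to the holdout data
[coeff,scoreTrain] = pca(XTrain,'NumComponents',2);
scoreTest = (XTest - mean(XTrain))*coeff;

C1 = fitcdiscr(XTrain,YTrain);                 % full predictor set
C2 = fitcdiscr(scoreTrain,YTrain);             % two principal component scores

% Test for equal accuracy on the holdout set (mid-p-value McNemar test by default)
h = compareHoldout(C1,C2,XTest,scoreTest,YTest)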
h = compareHoldout(___,Name,Value) specifies options using one or more name-value pair arguments in addition to the input argument combinations in previous syntaxes. For example, you can specify the type of alternative hypothesis, specify the type of test, and supply a cost matrix.
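For example, assuming trained models C1 and C2 and holdout predictor matrices X1 and X2 with true labels Y (hypothetical names), a sketch of specifying a one-sided alternative, the asymptotic McNemar test, and a 1% significance level is:

% Hypothetical inputs: trained models C1 and C2, holdout predictor data X1 and X2,
% and true class labels Y.
% Test whether the first model (C1) is more accurate than the second (C2),
% using the asymptotic McNemar test at the 1% significance level.
h = compareHoldout(C1,C2,X1,X2,Y, ...
    'Alternative','greater','Test','asymptotic','Alpha',0.01)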
Limitations
- compareHoldout does not compare ECOC models composed of linear or kernel classification models (that is, ClassificationLinear or ClassificationKernel model objects). To compare ClassificationECOC models composed of linear or kernel classification models, use testcholdout instead.
- Similarly, compareHoldout does not compare ClassificationLinear or ClassificationKernel model objects. To compare these models, use testcholdout instead.
Tips
One way to perform cost-insensitive feature selection is:

1. Train the first classification model (C1) using the full predictor set.
2. Train the second classification model (C2) using the reduced predictor set.
3. Specify X1 as the full test-set predictor data and X2 as the reduced test-set predictor data.
4. Enter compareHoldout(C1,C2,X1,X2,Y,'Alternative','less'). If compareHoldout returns 1, then there is enough evidence to suggest that the classification model that uses fewer predictors performs better than the model that uses the full predictor set.
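The following is a minimal sketch of steps 1 through 4, assuming a hypothetical numeric predictor matrix X, label vector Y, and an arbitrarily chosen predictor subset.

% Hypothetical data: X is a numeric predictor matrix, Y is a vector of class labels.
rng(1)
cvp = cvpartition(Y,'HoldOut',0.3);
XTrainFull = X(training(cvp),:);   YTrain = Y(training(cvp));
XTestFull  = X(test(cvp),:);       YTest  = Y(test(cvp));

subset = 1:3;                                    % hypothetical reduced predictor set

C1 = fitctree(XTrainFull,YTrain);                % step 1: full predictor set
C2 = fitctree(XTrainFull(:,subset),YTrain);      % step 2: reduced predictor set

X1 = XTestFull;                                  % step 3: full test-set predictor data
X2 = XTestFull(:,subset);                        %         reduced test-set predictor data

h = compareHoldout(C1,C2,X1,X2,YTest,'Alternative','less')   % step 4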
Alternatively, you can assess whether there is a significant difference between the accuracies of the two models. To perform this assessment, remove the 'Alternative','less' specification in step 4. compareHoldout conducts a two-sided test, and h = 0 indicates that there is not enough evidence to suggest a difference in the accuracy of the two models.

Cost-sensitive tests perform numerical optimization, which requires additional computational resources. The likelihood ratio test conducts numerical optimization indirectly by finding the root of a Lagrange multiplier in an interval. For some data sets, if the root lies close to the boundaries of the interval, then the method can fail. Therefore, if you have an Optimization Toolbox license, consider conducting the cost-sensitive chi-square test instead. For more details, see CostTest and Cost-Sensitive Testing.
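For example, assuming trained models C1 and C2, holdout predictor data X1 and X2, true labels Y, and a two-class problem (all hypothetical), a cost-sensitive comparison using the chi-square test might look like:

% Hypothetical inputs: trained models C1 and C2, holdout data X1 and X2, labels Y.
% Cost(i,j) is the cost of classifying an observation into class j when its true
% class is i; here, misclassifying the second class is twice as costly.
Cost = [0 1; 2 0];

% Cost-sensitive chi-square test (requires Optimization Toolbox)
h = compareHoldout(C1,C2,X1,X2,Y,'Cost',Cost,'CostTest','chisquare')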
Alternative Functionality
To directly compare the accuracy of two sets of class labels in predicting a set of true class labels, use testcholdout.
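For example, assuming two trained models Mdl1 and Mdl2 (for instance, ClassificationLinear objects, which compareHoldout does not accept), holdout predictor data XTest, and true labels YTest (all hypothetical), a sketch is:

% Hypothetical inputs: Mdl1 and Mdl2 are trained classification models,
% XTest holds the holdout predictors, and YTest holds the true labels.
YHat1 = predict(Mdl1,XTest);
YHat2 = predict(Mdl2,XTest);

% Test the null hypothesis that the two sets of predicted labels are equally accurate
h = testcholdout(YHat1,YHat2,YTest)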