Hi Alessandro,
Yes, you can estimate the generalization error of each individual binary learner within an ECOC (error-correcting output codes) multiclass model in MATLAB. However, as far as I know, there is no built-in function that cross-validates the binary learners of a fitcecoc model one by one. Calling crossval and kfoldLoss on the ECOC model itself reports the multiclass loss, not a loss per binary learner.
Approach:
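The binary learners stored in the ECOC model are compact models, so they cannot be passed to crossval directly. Instead, you can rebuild each binary problem from the coding matrix and cross-validate it separately. Here's how you can do it:
- Train the ECOC Model: Use fitcecoc with the 'allpairs' coding design to train your multiclass model.
- Identify the Class Pairs: Read from Mdl.CodingMatrix which two classes each binary learner separates.
- Cross-Validate Each Binary Problem: Refit the same type of binary classifier on the observations of those two classes with cross-validation turned on.
Here is a step-by-step example (it uses the Fisher iris data, which matches the class names below):
load fisheriris                          % example data: meas (150x4), species (150x1 cellstr)
X = meas; Y = species;
t = templateSVM('KernelFunction', 'linear');
Mdl = fitcecoc(X, Y, 'Learners', t, 'Coding', 'allpairs', ...
    'ClassNames', {'setosa', 'versicolor', 'virginica'});
codingMat = Mdl.CodingMatrix;            % rows = classes, columns = binary learners
numLearners = size(codingMat, 2);
binaryLosses = zeros(numLearners, 1);
rng(1);                                  % reproducible fold assignment
for i = 1:numLearners
    posClass = Mdl.ClassNames(codingMat(:, i) == 1);    % positive class of learner i
    negClass = Mdl.ClassNames(codingMat(:, i) == -1);   % negative class of learner i
    isClass = ismember(Y, [posClass; negClass]);        % keep only these two classes
    XBinary = X(isClass, :);
    YBinary = Y(isClass);
    % Mdl.BinaryLearners{i} is a compact model and cannot be cross-validated,
    % so refit a linear SVM on the two-class subset with 10-fold cross-validation.
    CVBinaryMdl = fitcsvm(XBinary, YBinary, 'KernelFunction', 'linear', 'CrossVal', 'on');
    binaryLosses(i) = kfoldLoss(CVBinaryMdl);
    fprintf('Binary Model %d (%s vs %s) Cross-Validation Loss: %.4f\n', ...
        i, posClass{1}, negClass{1}, binaryLosses(i));
end
averageBinaryLoss = mean(binaryLosses);
fprintf('Average Cross-Validation Loss for Binary Models: %.4f\n', averageBinaryLoss);
Explanation:
- Training the ECOC Model: fitcecoc with the 'allpairs' coding design trains one binary classifier per pair of classes, so K classes give K(K-1)/2 binary learners (3 for the iris data).
- Identifying the Class Pairs: The learners in Mdl.BinaryLearners use the labels -1 and 1 rather than the original class names, so the class pair is read from Mdl.CodingMatrix. In column i, the row containing 1 is the positive class, the row containing -1 is the negative class, and rows containing 0 are not used by learner i.
- Cross-Validation: Compact models do not store their training data, so for each pair the code selects the observations of the two classes and refits a linear SVM using fitcsvm with 'CrossVal', 'on' (10 folds by default). The resulting loss estimates the error of a binary learner trained the same way, not of the exact learner stored in Mdl.
- Binary Loss Calculation: kfoldLoss returns the misclassification rate by default. The code prints it for each binary problem, along with the average across all binary problems.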
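As a quick check of which two classes each binary learner separates, you can print the coding matrix of the trained model. The sketch below assumes the Fisher iris data (fisheriris, shipped with Statistics and Machine Learning Toolbox) and a linear SVM template; the variable names are only illustrative. It also cross-validates the full ECOC model, which gives the multiclass loss for comparison with the per-pair losses.

```matlab
% Sketch: list the class pair handled by each binary learner.
load fisheriris                          % assumption: Fisher iris example data
Mdl = fitcecoc(meas, species, ...
    'Learners', templateSVM('KernelFunction', 'linear'), ...
    'Coding', 'allpairs');

codingMat = Mdl.CodingMatrix;            % rows = classes, columns = binary learners
for j = 1:size(codingMat, 2)
    pos = Mdl.ClassNames{codingMat(:, j) == 1};    % class coded as +1
    neg = Mdl.ClassNames{codingMat(:, j) == -1};   % class coded as -1
    fprintf('Learner %d: %s (+1) vs %s (-1)\n', j, pos, neg);
end

% For comparison: cross-validate the whole ECOC model (multiclass loss).
rng(1);                                  % reproducible fold assignment
CVMdl = crossval(Mdl);                   % 10-fold by default
multiclassLoss = kfoldLoss(CVMdl);
fprintf('Multiclass 10-fold loss: %.4f\n', multiclassLoss);
```

With the 'allpairs' design every column of the coding matrix contains exactly one 1 and one -1, so each learner maps to a single class pair.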