Hi Basheer,
To implement 5 times 2-fold (5x2) cross-validation in MATLAB, you'll need to loop over the repetitions and the folds manually. Strictly speaking this is a repeated cross-validation rather than a nested one, since the inner loop does not tune anything; it simply repeats a 2-fold cross-validation 5 times. Each repetition splits your data into two halves, trains on one half and tests on the other, and then swaps the roles of the halves. A different random partition is used for each of the 5 repetitions.
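As a quick check of the "swap" idea, here is a minimal sketch, assuming the fisheriris example data and the Statistics and Machine Learning Toolbox. It shows that with a 2-fold cvpartition the training set of fold 1 is exactly the test set of fold 2. The variable names half1 and half2 are only illustrative.

```matlab
load fisheriris                           % example data; provides species
cv = cvpartition(species, 'KFold', 2);    % random stratified 2-fold partition

half1 = test(cv, 1);   % logical index of the observations tested in fold 1
half2 = test(cv, 2);   % logical index of the observations tested in fold 2

% The two test sets should not overlap and should together cover all rows
fprintf('Overlap: %d, covered: %d of %d\n', ...
    sum(half1 & half2), sum(half1 | half2), numel(species));

% Swapping roles: what fold 1 trains on is what fold 2 tests on
fprintf('Roles swap: %d\n', isequal(training(cv, 1), half2));
```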
Here’s how you can implement this in MATLAB and obtain the confusion matrix for each run:
% Load your dataset
load fisheriris % Example dataset
X = meas; % Features
y = species; % Labels
% Define the number of outer repetitions and inner folds
outerRepeats = 5;
innerFolds = 2;
% Initialize variables to store results
confusionMatrices = cell(outerRepeats, innerFolds);
accuracies = zeros(outerRepeats, innerFolds);
% Fix the class order so every confusion matrix uses the same row/column layout
classOrder = unique(y);
% Outer loop for repetitions
for outer = 1:outerRepeats
    % Create a new random (stratified) partition for 2-fold cross-validation
    cv = cvpartition(y, 'KFold', innerFolds);
    % Inner loop for 2-fold cross-validation
    for inner = 1:innerFolds
        % Get the training and test indices for the current fold
        trainIdx = training(cv, inner);
        testIdx = test(cv, inner);
        % Split the data into training and test sets for this fold
        XTrain = X(trainIdx, :);
        yTrain = y(trainIdx);
        XTest = X(testIdx, :);
        yTest = y(testIdx);
        % Train the model on the training set
        model = fitcknn(XTrain, yTrain);
        % Test the model on the test set
        predictions = predict(model, XTest);
        % Calculate the confusion matrix for the current fold
        confMat = confusionmat(yTest, predictions, 'Order', classOrder);
        confusionMatrices{outer, inner} = confMat;
        % Calculate accuracy for the current fold
        % (labels are cell arrays of char vectors, so compare with strcmp, not ==)
        accuracies(outer, inner) = mean(strcmp(predictions, yTest));
        % Display results for the current fold
        fprintf('Outer %d, Inner %d Accuracy: %.2f%%\n', outer, inner, accuracies(outer, inner) * 100);
    end
end
% Calculate the average accuracy across all repetitions and folds
averageAccuracy = mean(accuracies(:));
fprintf('Average Accuracy: %.2f%%\n', averageAccuracy * 100);
% Display one of the confusion matrices as an example
disp('Example Confusion Matrix:');
disp(confusionMatrices{1, 1});
Explanation:
- Outer Loop: Run the 2-fold cross-validation outerRepeats times. Each repetition draws a new random partition (stratified by class, because cvpartition is given the labels), so the estimate depends less on any single split.
- Inner Loop: Perform 2-fold cross-validation, where you alternate between two halves of the data for training and testing.
- Model Training and Testing: Train the model on the training set and evaluate it on the test set for each fold.
- Confusion Matrix and Accuracy: Calculate and store the confusion matrix and accuracy for each fold. You can access each confusion matrix by indexing into confusionMatrices, for example confusionMatrices{3, 2} for the second fold of the third repetition.
- Average Accuracy: Calculate the average accuracy over all runs and folds to get an overall performance measure.
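If you would rather have one summary than ten separate matrices, one option is to add them up. The following is a self-contained sketch under the same assumptions as the code above (fisheriris, fitcknn with default settings, Statistics and Machine Learning Toolbox). The names pooledConfMat, perClassRecall and repAccuracy, and the seed passed to rng, are my own choices for illustration and not part of any MATLAB API.

```matlab
load fisheriris
X = meas;
y = species;
classOrder = unique(y);   % same row/column order in every confusion matrix
rng(1);                   % assumed seed, only to make the splits repeatable

nRepeats = 5;
nFolds = 2;
confusionMatrices = cell(nRepeats, nFolds);
for r = 1:nRepeats
    cv = cvpartition(y, 'KFold', nFolds);      % new random split per repetition
    for k = 1:nFolds
        trainIdx = training(cv, k);
        testIdx = test(cv, k);
        model = fitcknn(X(trainIdx, :), y(trainIdx));
        pred = predict(model, X(testIdx, :));
        confusionMatrices{r, k} = confusionmat(y(testIdx), pred, 'Order', classOrder);
    end
end

% Stack all matrices along a third dimension and sum over it
pooledConfMat = sum(cat(3, confusionMatrices{:}), 3);

% Per-class recall: diagonal (correct) divided by the row sums (true counts)
perClassRecall = diag(pooledConfMat) ./ sum(pooledConfMat, 2);

% Accuracy per repetition (both folds together) and its spread
repAccuracy = zeros(nRepeats, 1);
for r = 1:nRepeats
    repMat = confusionMatrices{r, 1} + confusionMatrices{r, 2};
    repAccuracy(r) = trace(repMat) / sum(repMat(:));
end

disp('Pooled confusion matrix (rows = true class, columns = predicted):');
disp(pooledConfMat);
fprintf('Accuracy over repetitions: mean %.2f%%, std %.2f%%\n', ...
    mean(repAccuracy) * 100, std(repAccuracy) * 100);
```

Because every observation is tested exactly once per repetition, the entries of the pooled matrix sum to 5 times the number of observations. The standard deviation across repetitions gives a rough idea of how much the result depends on the random split.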