KFoldLoss: Get separate Loss for Classes

2 views (last 30 days)
Hi,
I'm trying to compare some models using k-fold-cross-validation.
My problem is that I'd like to get the loss for my binary classes separately (a TPR per class), as I have quite a big class imbalance and the overall loss isn't representative of the model's performance.
% Lbl and FV need to be initialized prior
Partitions = cvpartition(Lbl,'KFold',10); %Partition Data into 10 Folds
indResClass1 = zeros(10,6); %initialize Matrix of individual Fold results for Class1
%indResClass2 = zeros(10,6); %initialize Matrix of individual Fold results for Class2
%Train and validate the example Models
FineKNN = fitcknn(FV,Lbl,'CVPartition',Partitions,'NumNeighbors',1,'Standardize',1);
indResClass1(1:10,1) = ones(10,1) - kfoldLoss(FineKNN,'Mode','individual');
MediumKNN = fitcknn(FV,Lbl,'CVPartition',Partitions,'NumNeighbors',10,'Standardize',1);
indResClass1(1:10,2) = ones(10,1) - kfoldLoss(MediumKNN,'Mode','individual');
CoarseKNN = fitcknn(FV,Lbl,'CVPartition',Partitions,'NumNeighbors',100,'Standardize',1);
indResClass1(1:10,3) = ones(10,1) - kfoldLoss(CoarseKNN,'Mode','individual');
CosineKNN = fitcknn(FV,Lbl,'CVPartition',Partitions,'NumNeighbors',10,'Distance','cosine','Standardize',1);
indResClass1(1:10,4) = ones(10,1) - kfoldLoss(CosineKNN,'Mode','individual');
CubicKNN = fitcknn(FV,Lbl,'CVPartition',Partitions,'NumNeighbors',10,'Distance','minkowski','Standardize',1);
indResClass1(1:10,5) = ones(10,1) - kfoldLoss(CubicKNN,'Mode','individual');
WeightedKNN = fitcknn(FV,Lbl,'CVPartition',Partitions,'NumNeighbors',10,'DistanceWeight','squaredinverse','Standardize',1);
indResClass1(1:10,6) = ones(10,1) - kfoldLoss(WeightedKNN,'Mode','individual');
%Output Boxplot
boxplot(indResClass1,'Labels',{'Fine KNN','Medium KNN','Coarse KNN','Cosine KNN',...
'Cubic KNN','Weighted KNN'})
xlabel('Trained Models')
ylabel('Cross-Validation TPR')
My code for now partitions the data, trains some partitioned example models and validates them for a boxplot output. The problem is that I'd still like to get the separate losses in two different plots (see indResClass2, which is commented out for now).
Thanks in advance for any kind of help on this! :)

Answers (1)

Vatsal, 2024-5-9
Hi,
To get separate losses (or True Positive Rates, TPR) for each class in a binary classification problem using k-fold cross-validation, you can calculate the TPR for each class manually after predicting the class labels on the validation set in each fold. The "kfoldPredict" function in MATLAB can be used to obtain the predictions for each fold. These predictions can then be compared to the true labels to calculate the TPR for each class.
Here is how you can modify your code snippet for one of the models to include TPR calculation for both classes:
% Initialize matrices for TPRs for both classes
indTPRClass1 = zeros(10, 6);
indTPRClass2 = zeros(10, 6);
% Example for the FineKNN model
FineKNN = fitcknn(FV, Lbl, 'CVPartition', Partitions, 'NumNeighbors', 1, 'Standardize', 1);
predictions = kfoldPredict(FineKNN);
for fold = 1:10
    % Extract the test indices for this fold
    testIdx = test(Partitions, fold);
    trueLabels = Lbl(testIdx);
    predLabels = predictions(testIdx);
    % Calculate TPR for each class
    % (assumes numeric class labels 1 and 0; adjust the comparisons
    % if your labels are categorical, logical, or strings)
    indTPRClass1(fold, 1) = sum(predLabels == 1 & trueLabels == 1) / sum(trueLabels == 1);
    indTPRClass2(fold, 1) = sum(predLabels == 0 & trueLabels == 0) / sum(trueLabels == 0);
end
% Repeat similar steps for other models and fill in indTPRClass1 and indTPRClass2 accordingly
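To avoid repeating those steps by hand, the per-fold TPR calculation can be wrapped in a loop over all six cross-validated models and then plotted as two separate boxplots, one per class. This is a sketch that assumes the models were trained with the same Partitions as in the question and that the class labels are the numeric values 1 and 0; adjust the comparisons for your actual label type.

```matlab
% Collect the cross-validated models trained in the question
models = {FineKNN, MediumKNN, CoarseKNN, CosineKNN, CubicKNN, WeightedKNN};
modelNames = {'Fine KNN','Medium KNN','Coarse KNN','Cosine KNN',...
    'Cubic KNN','Weighted KNN'};

indTPRClass1 = zeros(10, numel(models));
indTPRClass2 = zeros(10, numel(models));

for m = 1:numel(models)
    predictions = kfoldPredict(models{m});
    for fold = 1:10
        testIdx = test(Partitions, fold);
        trueLabels = Lbl(testIdx);
        predLabels = predictions(testIdx);
        % TPR per class; replace 1/0 with your actual class labels
        indTPRClass1(fold, m) = sum(predLabels == 1 & trueLabels == 1) / sum(trueLabels == 1);
        indTPRClass2(fold, m) = sum(predLabels == 0 & trueLabels == 0) / sum(trueLabels == 0);
    end
end

% Two separate boxplots, one per class
figure
boxplot(indTPRClass1, 'Labels', modelNames)
xlabel('Trained Models'); ylabel('Cross-Validation TPR (Class 1)')

figure
boxplot(indTPRClass2, 'Labels', modelNames)
xlabel('Trained Models'); ylabel('Cross-Validation TPR (Class 0)')
```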
To learn more about "kfoldPredict" usage and syntax, you may refer to the MATLAB documentation for kfoldPredict.
I hope this helps!
