Why is loss() different from calculating misclassification error using predict()?
I am trying to fit an ECOC model to my data, but the misclassification error returned by loss() differs from the one I get by comparing the labels predicted by predict() with the true labels. The same thing happens with a different model, e.g. KNN.
The test dataset has 10 observations, so to my knowledge the misclassification error should be a multiple of 0.1, yet loss() outputs 0.8293.
Could someone please help me understand why these are different, i.e. what is going on inside the loss() function? And which of the two is more appropriate for evaluating and reporting test set accuracy?
rng(1234)
% define variables
xtrain = rand(100,4); % random numbers, n = 100
xtest = rand(10,4); % random numbers, n = 10
ytrain = ceil(4*rand(100,1)); % 4 classes, n = 100
ytest = ceil(4*rand(10,1)); % 4 classes, n = 10
% train model
mdl1 = fitcecoc(xtrain,ytrain,'Coding','onevsall','Learners','svm');
mdl2 = fitcknn(xtrain,ytrain);
% calculate loss from loss()
loss1mdl1 = loss(mdl1,xtest,ytest);
loss1mdl2 = loss(mdl2,xtest,ytest);
% calculate loss from predict()
loss2mdl1 = 1-mean(predict(mdl1,xtest)==ytest);
loss2mdl2 = 1-mean(predict(mdl2,xtest)==ytest);
Answers (2)
Sulaymon Eshkabilov
2023-6-29
There is a small difference between what loss() computes and the error rate derived from predict(). It arises because loss() takes the observation weights into account when it calculates the loss value. Otherwise, everything is working as expected:
rng(1234)
% define variables
xtrain = rand(100,4); % random numbers, n = 100
xtest = rand(10,4); % random numbers, n = 10
ytrain = ceil(4*rand(100,1)); % 4 classes, n = 100
ytest = ceil(4*rand(10,1)); % 4 classes, n = 10
% train model
mdl1 = fitcecoc(xtrain,ytrain,'Coding','onevsall','Learners','svm');
mdl2 = fitcknn(xtrain,ytrain);
% calculate loss from loss()
loss1mdl1 = loss(mdl1,xtest,ytest)
loss1mdl2 = loss(mdl2,xtest,ytest)
Y1 = predict(mdl1,xtest);
Y2 = predict(mdl2,xtest);
YC1 = [ytest,Y1] % Two correct answers out of 10, i.e., accuracy is 20%
YC2 = [ytest,Y2] % Three correct answers out of 10, i.e., accuracy 30%
% calculate loss from predict()
loss2mdl1 = 1-mean(predict(mdl1,xtest)==ytest)
loss2mdl2 = 1-mean(predict(mdl2,xtest)==ytest)
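To make the effect of the weights concrete, here is a minimal sketch of a prior-weighted error computed by hand for the KNN model. It assumes that each test observation in class k receives the weight Prior(k)/n_k, where n_k is the number of test observations in that class, and that the weights are renormalized when a class is missing from the test set. That scheme is an assumption about what loss() does internally, not something taken from its source, and the variable names w, weightedErr and plainErr are made up for illustration.

```matlab
rng(1234)
xtrain = rand(100,4);            % same synthetic data as in the question
xtest  = rand(10,4);
ytrain = ceil(4*rand(100,1));
ytest  = ceil(4*rand(10,1));
mdl  = fitcknn(xtrain,ytrain);
yhat = predict(mdl,xtest);

% Assumed weighting: within each class the weights sum to that class's
% prior (mdl.Prior is ordered like mdl.ClassNames).
w = zeros(size(ytest));
for k = 1:numel(mdl.ClassNames)
    inClass = (ytest == mdl.ClassNames(k));
    if any(inClass)
        w(inClass) = mdl.Prior(k) / sum(inClass);
    end
end
w = w / sum(w);                  % renormalize if a class is absent from ytest

weightedErr = sum(w .* (yhat ~= ytest));   % prior-weighted error
plainErr    = mean(yhat ~= ytest);         % plain fraction of wrong labels
fprintf('weighted: %.4f  plain: %.4f  loss(): %.4f\n', ...
    weightedErr, plainErr, loss(mdl,xtest,ytest));
```

If the assumption holds, weightedErr should be close to the value returned by loss(), while plainErr stays a multiple of 0.1. The two only coincide when the class proportions in the test set match the priors.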
Drew
2025-5-7
This happens because the classreg loss function normalizes the observation weights so that, within each class, they sum to that class's prior probability. You can avoid this by providing a custom loss function, as seen in this answer: https://www.mathworks.com/matlabcentral/answers/492062-loss-the-classification-error
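As a rough sketch of that idea (not the code from the linked answer), a custom loss can be passed through the 'LossFun' name-value argument as a handle with the signature lossfun(C,S,W,Cost). The handle below, given the hypothetical name unweightedErr, ignores the weights W and the cost matrix Cost and simply counts the rows where the true class does not have the highest score.

```matlab
rng(1234)
xtrain = rand(100,4);            % same synthetic data as in the question
xtest  = rand(10,4);
ytrain = ceil(4*rand(100,1));
ytest  = ceil(4*rand(10,1));
mdl1 = fitcecoc(xtrain,ytrain,'Coding','onevsall','Learners','svm');
mdl2 = fitcknn(xtrain,ytrain);

% C: n-by-K logical matrix of true class membership
% S: n-by-K scores (negated losses for ECOC), columns ordered as ClassNames
% W and Cost are deliberately unused, so every observation counts equally.
% Caveat: if a row of S has tied maxima, it counts as correct whenever the
% true class is among them, which may differ from how predict() breaks ties.
unweightedErr = @(C,S,W,Cost) mean(~any(C & (S == max(S,[],2)), 2));

errEcoc = loss(mdl1,xtest,ytest,'LossFun',unweightedErr);
errKnn  = loss(mdl2,xtest,ytest,'LossFun',unweightedErr);
fprintf('ECOC: %.1f  KNN: %.1f\n', errEcoc, errKnn);
```

With 10 test observations, both values come out as multiples of 0.1 and should agree with 1-mean(predict(mdl,xtest)==ytest) as long as there are no tied scores.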