KNN classifier with ROC Analysis
Hi Smart guys,
I wrote the following code to plot the ROC curve for my KNN classifier:
load fisheriris;
features = meas;
featureSelcted = features;
numFeatures = size(meas,2); % number of feature columns; size(meas,1) would be the number of observations
%%Define ground truth
groundTruthGroup = species;
%%Construct a KNN classifier
KNNClassifierObject = ClassificationKNN.fit(featureSelcted, groundTruthGroup, 'NumNeighbors', 3, 'Distance', 'euclidean');
% Predict resubstitution response of k-nearest neighbor classifier
[KNNLabel, KNNScore] = resubPredict(KNNClassifierObject);
% Fit probabilities for scores
groundTruthNumericalLable = [ones(50,1); zeros(50,1); -1.*ones(50,1)];
[FPR, TPR, Thr, AUC, OPTROCPT] = perfcurve(groundTruthNumericalLable(:,1), KNNScore(:,1), 1);
Then we can plot the FPR vs TPR to get the ROC curve.
However, the FPR and TPR values differ from those produced by my own implementation. The code above yields only three points on the ROC curve, whereas my implementation (below) yields 151 points, since the data set has 150 observations.
patternsKNN = [KNNScore(:,1), groundTruthNumericalLable(:,1)];
patternsKNN = sortrows(patternsKNN, -1);
groundTruthPattern = patternsKNN(:,2);
POS = cumsum(groundTruthPattern==1);
TPR = POS/sum(groundTruthPattern==1);
NEG = cumsum(groundTruthPattern==0);
FPR = NEG/sum(groundTruthPattern==0);
FPR = [0; FPR];
TPR = [0; TPR];
How can I configure perfcurve so that it returns all the points of the ROC curve? Thanks a lot.
A.
1 Comment
Alessandro
2013-3-20
Edited: Alessandro
2013-3-20
Try adding 'xvals','all':
[FPR, TPR, Thr, AUC, OPTROCPT] = perfcurve(groundTruthNumericalLable(:,1), KNNScore(:,1), 1, 'xvals', 'all');
Accepted Answer
Ilya
2013-3-19
For 3 neighbors, the posterior probability has at most 4 distinct values, namely (0:3)/3. Likely fewer for the Fisher iris data, because the classes are well separated. With 4 distinct score values, you won't see more than 4 points on the ROC curve. Your implementation does not account for such ties.
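To illustrate the point about ties, here is a minimal sketch in base MATLAB of a tie-aware ROC computation. It is not how perfcurve is implemented; the toy scores and labels are made up for illustration (they are not the iris results), and the variable names are hypothetical. The idea is simply that every observation sharing a score crosses the threshold at the same time, so the curve gets one point per distinct score value, plus the (0,0) starting point.

```matlab
% Hypothetical toy data: scores take values in (0:3)/3, as a 3-NN posterior would.
scores = [1; 1; 1; 2/3; 2/3; 1/3; 0; 0; 0; 0];
labels = [1; 1; 1; 1;   0;   1;   0; 0; 0; 0];   % 1 = positive, 0 = negative

% One threshold per distinct score value, highest first.
thr = sort(unique(scores), 'descend');

nPos = sum(labels == 1);
nNeg = sum(labels == 0);

% The first point is (0,0): nothing is predicted positive yet.
TPR = zeros(numel(thr) + 1, 1);
FPR = zeros(numel(thr) + 1, 1);
for k = 1:numel(thr)
    predictedPos = (scores >= thr(k));   % all tied observations enter together
    TPR(k + 1) = sum(predictedPos & (labels == 1)) / nPos;
    FPR(k + 1) = sum(predictedPos & (labels == 0)) / nNeg;
end

disp([FPR, TPR]);   % one row per ROC point
```

With 10 observations but only 4 distinct scores, this produces 5 rows, not 11.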
2 Comments
Ilya
2013-3-20
Yes, it does mean that your implementation is wrong. As I said, you can't have more points on a ROC curve than distinct threshold values. This is actually quite simple - you just need to think about it.
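One way to convince yourself, sketched here with made-up data and hypothetical variable names: a per-row cumsum puts extra points inside each block of tied scores, and those points depend on the arbitrary order of the tied rows. Reordering rows that have identical scores changes the intermediate points, while the points at the end of each tied block stay the same. Only the latter correspond to actual thresholds.

```matlab
% Hypothetical toy data, already sorted by descending score.
scores = [1; 1; 2/3; 2/3; 2/3; 0];
labels = [1; 1; 1;   0;   0;   0];   % 1 = positive, 0 = negative

% Per-row TPR, one step per observation (as in the cumsum approach).
rowTPR = [0; cumsum(labels == 1) / sum(labels == 1)];

% Swap rows 3 and 5, which share the score 2/3, so the sort order is still valid.
perm = [1; 2; 5; 4; 3; 6];
rowTPR2 = [0; cumsum(labels(perm) == 1) / sum(labels == 1)];

% Mark the last row of each block of tied scores.
isBlockEnd = [diff(scores) ~= 0; true];
keep = [false; isBlockEnd];          % skip the leading (0,0) entry

disp([rowTPR, rowTPR2]);             % the two columns differ inside the tied block
disp([rowTPR(keep), rowTPR2(keep)]); % but agree at the end of every tied block
```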