MATLAB's k-fold cross-validation shuffles the samples between the training and test partitions, so apparently it cannot be guaranteed that one specific subject is excluded from the training set. There is a loss function that takes an input argument called "usenfort", which indicates which observations in each partition should be used for testing; there one can see that the samples included in the test set are shuffled.
Customized loss function for cross-validation
I trained a decision tree regression model with the following code:
MdlDeep = fitrtree(X,Y,'KFold',SbjNm,'MergeLeaves','off','MinParentSize',1,'Surrogate','on');   % cross-validated regression tree with SbjNm folds
and customized the loss function to test the model accuracy:
LossEst(OutCnt)=kfoldLoss(CllTr{OutCnt},'LossFun',@TstLossFunIn);
The customized loss function was:
function lossvalue = TstLossFunIn(C,S,W)
    % C: observed responses, S: predicted responses, W: observation weights
    DffTtl = (C - S).^2;             % squared errors
    DffTtl = DffTtl.*W;              % apply observation weights
    SSE = sum(DffTtl);               % weighted sum of squared errors
    SSTM = mean((C - mean(C)).^2);   % mean squared deviation of the observed responses
    lossvalue = SSE/SSTM;            % normalized error
end
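As a sanity check on what this loss measures, roughly the same quantity can be recomputed from the out-of-fold predictions. This is a minimal sketch, assuming unit observation weights and that Y holds the observed responses; it may not match kfoldLoss exactly, since that call applies the observation weights and aggregates over folds, and the names YHat and ChkLoss are only illustrative:
YHat = kfoldPredict(CllTr{OutCnt});   % out-of-fold prediction for every observation
SSE  = sum((Y - YHat).^2);            % sum of squared errors (unit weights assumed)
SSTM = mean((Y - mean(Y)).^2);        % same normalization as in TstLossFunIn
ChkLoss = SSE/SSTM;                   % should be in the same ballpark as the kfoldLoss result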
This results in a reasonable loss for my problem. However, I wanted to control the cross-validation procedure myself, so I modified the code to split the training and test data manually and see how the model performs:
for SbjCnt = 1:SbjNm
    % Leave out the row for one subject and train on the remaining data
    TrnDt = X;
    TrnDt(SbjCnt,:) = [];
    TrnOut = Y;
    TrnOut(SbjCnt) = [];
    MdlDeep = fitrtree(TrnDt,TrnOut,'MergeLeaves','off','MinParentSize',1,'Surrogate','on');
    % Predict the response for the held-out subject
    TstDt = XS(SbjCnt,:);
    EstY(SbjCnt) = predict(MdlDeep,TstDt);   % store one prediction per subject
end
Now I wanted to calculate the loss. The problem is that in this case the calculated loss is very different from the loss in the first scenario, and the model does not seem accurate at all.
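For comparison, one way to compute the analogous loss over the manual split is from the stored predictions. This is only a sketch, assuming EstY collects one prediction per subject as in the loop above and unit weights; LossManual is an illustrative name:
SSE  = sum((Y(:) - EstY(:)).^2);   % squared error of each held-out prediction
SSTM = mean((Y - mean(Y)).^2);     % same normalization as in TstLossFunIn
LossManual = SSE/SSTM;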
Any hint as to why this happens?
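One way to rule out the effect of random fold assignment would be to pin the built-in cross-validation to a fixed leave-one-out partition, so that each fold tests exactly one observation, like the manual loop. A minimal sketch, assuming one row of X per subject; cvp, MdlLOO, and YHatLOO are illustrative names:
cvp = cvpartition(size(X,1),'LeaveOut');   % one observation per test fold, fixed assignment
MdlLOO = fitrtree(X,Y,'CVPartition',cvp,'MergeLeaves','off','MinParentSize',1,'Surrogate','on');
YHatLOO = kfoldPredict(MdlLOO);            % held-out prediction for each observation/subject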
Best regards,
Afshin
Answers (0)