Crossvalidation: anonymous function handle with toolbox classifiers
Hi everyone,
I'd like to use MATLAB's cross-validation function (crossval) with a random forest classification toolbox (specifically http://code.google.com/p/randomforest-matlab/). According to the documentation ( http://www.mathworks.com/help/toolbox/stats/crossval.html), predfun should be a function that returns the predictions for a set of test data XTEST. So, following that syntax, I should pass a function like this:
classf= @(XTRAIN,ytrain,XTEST) classRF_predict(XTEST,classRF_train(XTRAIN,ytrain,1000));
This function takes XTEST and the trained model, which in turn needs XTRAIN and ytrain. The problem comes when I run the cross-validation: I get the following error message.
cvMCR = crossval('mcr',X,y,'predfun',classf)
Error using crossval>evalFun (line 465)
The function
'@(XTRAIN,ytrain,XTEST)classRF_predict(XTEST,classRF_train(XTRAIN,ytrain,1000))'
generated the following error:
Cannot concatenate a double array and a nominal array.
Error in crossval>getLossVal (line 502)
funResult = evalFun(funorStr,arg(1:end-1));
Error in crossval (line 401)
[funResult,outarg] = getLossVal(i, nData, cvp, data, predfun);
I'd really appreciate any help.
Regards!
Answers (4)
Ilya
2012-4-26
I think you've hit a bug in the crossval function. My guess is that classRF_predict returns numeric labels, and crossval does not process them correctly for the 'mcr' criterion. The workaround is to convert class labels returned by classRF_predict to the nominal type:
classf= @(XTRAIN,ytrain,XTEST) nominal(classRF_predict(XTEST,classRF_train(XTRAIN,ytrain,1000)));
and execute the call to crossval in the same way as before
cvMCR = crossval('mcr',X,y,'predfun',classf)
Alternatively, you could use the other signature for crossval
vals = crossval(fun,X,y)
and define
fun = @(Xtrain,Ytrain,Xtest,Ytest) mean(Ytest ~= classRF_predict(Xtest,classRF_train(Xtrain,Ytrain,1000)));
In this case, since you are comparing the true and predicted labels yourself, you can keep them numeric.
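A minimal, self-contained sketch of this second signature follows. The toy data and the nearest-mean predictor (sqDist, predictNearestMean) are hypothetical stand-ins so the example runs without randomforest-matlab; for real use you would substitute the classRF_predict(Xtest,classRF_train(...)) call. It assumes the Statistics Toolbox is installed for crossval.

```matlab
% Toy data: two well-separated classes with numeric labels 1 and 2.
rng(0);                                   % reproducible random partition
X = [randn(50,2); randn(50,2) + 5];
y = [ones(50,1); 2*ones(50,1)];

% Hypothetical stand-in classifier (NOT part of randomforest-matlab):
% assign each test row to the class whose training mean is closest.
sqDist = @(A,c) sum(bsxfun(@minus, A, c).^2, 2);
predictNearestMean = @(Xtr,ytr,Xte) ...
    1 + (sqDist(Xte, mean(Xtr(ytr==2,:),1)) < sqDist(Xte, mean(Xtr(ytr==1,:),1)));

% crossval calls fun(XTRAIN,YTRAIN,XTEST,YTEST) once per fold.
fun = @(Xtrain,Ytrain,Xtest,Ytest) ...
    mean(Ytest ~= predictNearestMean(Xtrain,Ytrain,Xtest));

vals  = crossval(fun, X, y);   % default is 10-fold: one value per fold
cvErr = mean(vals);            % average per-fold misclassification rate
```

Note that mean(vals) weights every fold equally; if the folds have different sizes, this can differ slightly from the pooled misclassification rate that 'mcr' reports.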
Let me know if either solution works for you.
Ilya
2012-4-26
I am not an expert on the randomforest-matlab package, so my advice could be off. I find two things in your post worth investigating:
- It is strange that you use Xtest as the 1st input to classRF_predict(XTEST,classRF_train(XTRAIN,ytrain,1000)). Usually it is the trained object that is the 1st argument.
- Make sure that the array of class labels, y, you pass to crossval has the same type as labels returned by classRF_predict.
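A quick way to check the second point is to compare the classes of the two label arrays. This is only a sketch with made-up labels: y stands for whatever you pass to crossval, and yhat for whatever your prediction function returns (assumed numeric here). It needs the Statistics Toolbox for nominal.

```matlab
% Hypothetical labels for illustration only.
y    = nominal({'a'; 'b'; 'a'});   % true labels stored as a nominal array
yhat = [1; 2; 1];                  % predicted labels returned as doubles

% Report both types and whether they match.
fprintf('class(y) = %s, class(yhat) = %s\n', class(y), class(yhat));
sameType = strcmp(class(y), class(yhat));   % false for these two arrays
```

If the types differ, convert one side so that both arrays use the same type and the same label values before comparing them.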
Cristobal
2012-4-26
1 comment
Ilya
2012-4-27
Did you see my answer above?
You can modify crossval if you'd like, but in that case use
temploss = sum(outarg ~= nominal(funResult));
That way you can continue using crossval with labels of all types. After what you did, you can only use crossval with handles that return labels of type double.