Crossvalidation: anonymous function handle with toolbox classifiers
Hi everyone,
I'd like to use the MATLAB cross-validation function (crossval) with a random forest classification toolbox (specifically http://code.google.com/p/randomforest-matlab/). According to the documentation (http://www.mathworks.com/help/toolbox/stats/crossval.html), predfun should be a function that returns the predictions for a set of test data XTEST. So, following that syntax, I should supply a function like this:
classf= @(XTRAIN,ytrain,XTEST) classRF_predict(XTEST,classRF_train(XTRAIN,ytrain,1000));
This function takes XTEST and the trained model as input, and the model in turn needs XTRAIN and ytrain. The problem comes when I try to run the cross-validation; I get the following error message:
cvMCR = crossval('mcr',X,y,'predfun',classf)
Error using crossval>evalFun (line 465)
The function
'@(XTRAIN,ytrain,XTEST)classRF_predict(XTEST,classRF_train(XTRAIN,ytrain,1000))'
generated the following error:
Cannot concatenate a double array and a nominal array.
Error in crossval>getLossVal (line 502)
funResult = evalFun(funorStr,arg(1:end-1));
Error in crossval (line 401)
[funResult,outarg] = getLossVal(i, nData, cvp, data, predfun);
I'd really appreciate any help.
Regards!
Answers (4)
Ilya
2012-4-26
I think you've hit a bug in the crossval function. My guess is that classRF_predict returns numeric labels, and crossval does not process them correctly for the 'mcr' criterion. The workaround is to convert class labels returned by classRF_predict to the nominal type:
classf= @(XTRAIN,ytrain,XTEST) nominal(classRF_predict(XTEST,classRF_train(XTRAIN,ytrain,1000)));
and execute the call to crossval in the same way as before
cvMCR = crossval('mcr',X,y,'predfun',classf)
Alternatively, you could use the other signature for crossval
vals = crossval(fun,X,y)
and define
fun = @(Xtrain,Ytrain,Xtest,Ytest) mean(Ytest ~= classRF_predict(Xtest,classRF_train(Xtrain,Ytrain,1000)));
In this case, since you are comparing the true and predicted labels yourself, you can keep them numeric.
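The second signature can be tried without the randomforest-matlab package. The following is only a sketch under assumptions: it uses the fisheriris sample data and the Statistics Toolbox function classify (linear discriminant analysis) as a stand-in for classRF_train/classRF_predict, and the variable names are illustrative. The point is that the labels stay numeric from start to finish, so crossval never has to combine double and nominal arrays.

```matlab
% Sketch only: classify() is a stand-in for the random forest calls,
% chosen so the example runs with the Statistics Toolbox alone.
load fisheriris                      % meas: 150x4 double, species: 150x1 cellstr
X = meas;
y = grp2idx(species);                % numeric class labels 1, 2, 3

% crossval calls fun(XTRAIN, YTRAIN, XTEST, YTEST) once per fold.
% The handle returns the misclassification rate on that fold's test set.
fun = @(Xtrain, Ytrain, Xtest, Ytest) ...
    mean(Ytest ~= classify(Xtest, Xtrain, Ytrain));

vals  = crossval(fun, X, y);         % one value per fold (10 folds by default)
cvMCR = mean(vals);                  % average misclassification rate over folds
```

With the random forest package, the call to classify would presumably be replaced by classRF_predict(Xtest, classRF_train(Xtrain, Ytrain, 1000)), as in the handle above. Averaging the per-fold rates is close to what 'mcr' reports, although the two can differ slightly when the folds are not all the same size.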
Let me know if either solution works for you.
Ilya
2012-4-26
0 votes
I am not an expert on the randomforest-matlab package, so my advice could be off. I find two things in your post worth investigating:
- It is strange that you use Xtest as the 1st input to classRF_predict(XTEST,classRF_train(XTRAIN,ytrain,1000)). Usually it is the trained object that is the 1st argument.
- Make sure that the array of class labels, y, you pass to crossval has the same type as labels returned by classRF_predict.
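The second point can be checked directly by comparing the class of the two label arrays. This is a minimal sketch with made-up data: y and yhat are hypothetical stand-ins for the labels passed to crossval and the labels returned by the prediction function, and the conversion assumes the nominal level names are numeric strings such as '1' and '2'.

```matlab
% Hypothetical stand-ins, not output from randomforest-matlab.
y    = nominal([1; 2; 2; 1]);        % true labels stored as a nominal array
yhat = [1; 2; 1; 1];                 % predicted labels stored as double

% Report whether the two arrays have the same type.
sameType = strcmp(class(y), class(yhat));
fprintf('y is %s, yhat is %s\n', class(y), class(yhat));

% Bring both to a common numeric type before comparing.
% Assumes the level names of y parse as numbers.
yNum = str2double(cellstr(y));
mcr  = mean(yNum ~= yhat);           % fraction of mismatched labels
```

If the types differ, converting one side explicitly, either to nominal or to double, before the comparison avoids relying on how crossval combines them internally.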
Cristobal
2012-4-26
1 comment
Ilya
2012-4-27
Did you see my answer above?
You can modify crossval if you'd like, but in that case use
temploss = sum(outarg ~= nominal(funResult));
That way you can continue using crossval with labels of all types. After what you did, you can only use crossval with handles that return labels of type double.