Problem running a cvpartition with a tall array
MATLAB's documentation indicates that the cvpartition function supports tall arrays, as long as it uses a stratified holdout partition. Therefore, this should work when "group" is an M-by-1 double column vector pulled out of a datastore:
myPartition = cvpartition(group, 'Holdout',.25);
A = gather(test(myPartition));
It yields the proper logical vector if I load the "group" array into memory. But as a tall array, I instead get this error:
Error using internal.stats.bigdata.cvpartitionTallImpl (line 92)
P is too small to have a non-empty test set.
The gather operation is not the issue here; this is the first command applied to the tall array after it is created.
I think I've tracked down the cause of that error to this bit of code in the cvpartitionInMemoryImpl class:
if (isempty(cv.Group) && floor(cv.N *T) == 0) ||...
(~isempty(cv.Group) && floor(length(cv.Group) * T) == 0)
error(message('stats:cvpartition:PTooSmall'));
end
Here T is (or at least is supposed to be) the 0.25 holdout proportion.
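To see when that condition can fire, here is a minimal sketch of the arithmetic alone, assuming T really is 0.25. The observation counts in N are made up for illustration, and nothing here calls the toolbox internals.

```matlab
% Illustrative only: reproduces the arithmetic of the quoted check,
% not the cvpartition implementation itself.
T = 0.25;                        % holdout proportion from the call above
N = [1 3 4 100 1e6];             % hypothetical observation counts
tooSmall = floor(N * T) == 0;    % true where the test set would be empty

% Show each count next to the result of the check
disp(table(N', tooSmall', 'VariableNames', {'N', 'tooSmall'}));
```

With T = 0.25 the condition is true only for fewer than 4 observations, which suggests that in the tall case the check is seeing a much smaller count (or a different T) than I expect.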
Is there a way around this error? I'm working with some very, very large data files and would like to take advantage of the tall array functionality wherever possible.