Problems when adding a column to a tall array
I'm trying to add a new column to an already existing tall array.
The tall array is based on a datastore.
DS = datastore('Matlab_DS');
Table = tall(DS);
I gather a column of the table, do some calculations and save the results in a new cell array.
Column1 = gather(Table.Column1); % 4975885x1 cell
ColumnNew = SomeFunction(Column1); % 4975885x1 cell (fully evaluated at this point)
Then I'm trying to add the newly calculated column to the original table.
Table.ColumnNew = tall(ColumnNew);
The following error message appears when I try to gather the Table:
Incompatible tall array arguments. The first dimension in each tall array must have the same size, and each tall array must be based on the same
datastore.
Thinking it was a problem with the datastore, I tried to concatenate a new tall array instead:
Table2 = [Table tall(ColumnNew)];
When I gather Table2 now, a slightly different error message appears:
Incompatible tall array arguments. The first dimension in each tall array must have the same size, or have a size of 1.
I can gather ColumnNew, and it shows the same number of rows as the Table, so that shouldn't be a problem.
However, this works:
Table2 = [gather(Table.Column1) ColumnNew];
Gathering the whole Table first and then concatenating the new column also seems to work. However, the full Table is too big to fit into memory, so gathering it is not an option for me.
Am I doing something wrong, is this some kind of bug, or is it simply not possible to add a new column to a tall table this way?
My MATLAB release: R2016b
Accepted Answer
Josh Meyer
2017-8-28
In the first case, you get the error because the variable you are adding is a tall array that was created from an in-memory array, so it is not based on the same datastore as the table. Instead of gathering, working on in-memory data, converting back to tall, and then trying to append to the table, you can simply operate on the tall column directly:
Table.ColumnNew = SomeFunction(Table.Column1)
For example:
ds = datastore('airlinesmall.csv','TreatAsMissing','NA');
ds.SelectedVariableNames = 'ArrDelay';
t = tall(ds)
t.Sine = sin(t.ArrDelay)                  % works: derived from the same tall table
t.Sine2 = tall(gather(sin(t.ArrDelay)))   % doesn't work: tall array rebuilt from in-memory data
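If the function only works on one cell at a time and cannot take a tall column directly, one option is to apply it per element with cellfun, so the result is still derived from the same tall table. The following is only a minimal sketch under stated assumptions: the small in-memory table stands in for the datastore-backed one, and @upper is a hypothetical placeholder for SomeFunction. Check that cellfun supports tall arrays in your release.

```matlab
% Hypothetical stand-in for the datastore-backed tall table.
T = tall(table({'ab'; 'cde'; 'f'}, 'VariableNames', {'Column1'}));

% Apply a per-cell function without gathering first.
% @upper is a placeholder for SomeFunction (an assumption, not the asker's code).
T.ColumnNew = cellfun(@upper, T.Column1, 'UniformOutput', false);

% Evaluation is deferred until gather; only done here because the demo data is tiny.
result = gather(T);
```

Because ColumnNew is computed from T.Column1 itself, both variables share the same underlying data source and the same height.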
In the second case, you are attempting to horizontally concatenate a table with a column vector that is not itself a table, which does not work even for in-memory arrays.
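For in-memory data, the usual way around this is to wrap the column in a table before concatenating, or to assign it as a new variable. A small sketch with made-up data (the names T, c, A, and B are illustrative only):

```matlab
% Small in-memory table and a cell column of the same height (illustrative data).
T = table([1; 2; 3], 'VariableNames', {'A'});
c = {'x'; 'y'; 'z'};

% Per the answer above, [T c] is not expected to work,
% because c is a plain cell array rather than a table.

% Option 1: wrap the column in a table, then concatenate two tables.
T2 = [T, table(c, 'VariableNames', {'B'})];

% Option 2: assign the column as a new table variable.
T3 = T;
T3.B = c;
```

Both options give a 3-by-2 table with variables A and B. For tall tables, the new column must also come from the same tall source.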