Problems when adding a column to a tall array

I'm trying to add a new column to an already existing tall array.
The tall array is based on a datastore.
DS = datastore('Matlab_DS');
Table = tall(DS);
I gather a column of the table, do some calculations and save the results in a new cell array.
Column1 = gather(Table.Column1); % 4975885x1 cell
ColumnNew = SomeFunction(Column1); % 4975885x1 cell (fully evaluated at this point)
Then I'm trying to add the newly calculated column to the original table.
Table.ColumnNew = tall(ColumnNew);
The following error message appears when I try to gather the Table:
Incompatible tall array arguments. The first dimension in each tall array must have the same size, and each tall array must be based on the same datastore.
Thinking it is a problem with the datastore, I try to concatenate a new tall array instead:
Table2 = [Table tall(ColumnNew)];
When I gather Table2 now, a slightly different error message appears:
Incompatible tall array arguments. The first dimension in each tall array must have the same size, or have a size of 1.
I can gather ColumnNew, and it shows the same number of rows as the Table, so that shouldn't be a problem.
However, this works:
Table2 = [gather(Table.Column1) ColumnNew];
If I gather the whole Table first and then concatenate the new column, it seems to work. However, the full Table is too big to fit into memory, so gathering it is not an option for me.
Am I doing something wrong, is this some kind of bug, or is it simply not possible to add a new column to a tall table this way?
My MATLAB release: 2016b

Accepted Answer

Josh Meyer 2017-8-28
In the first case, you get an error because tall(ColumnNew) is a tall array created from an in-memory array, so it is not based on the same datastore as Table. Instead of gathering, working on in-memory data, converting back to tall, and then appending to the table, you can operate on the tall column directly:
Table.ColumnNew = SomeFunction(Table.Column1)
For example:
ds = datastore('airlinesmall.csv', 'TreatAsMissing', 'NA');
ds.SelectedVariableNames = 'ArrDelay';
t = tall(ds)
t.Sine = sin(t.ArrDelay)                 % works
t.Sine2 = tall(gather(sin(t.ArrDelay)))  % doesn't work
In the second case, you are attempting to concatenate a column vector with a table, which does not work even for in-memory arrays.
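To make the first point concrete, here is a minimal sketch of the deferred workflow, assuming the airlinesmall.csv sample file that ships with MATLAB. The variable name DelayGain and the subtraction are purely illustrative stand-ins for SomeFunction; the point is only that the new column is computed from the tall table itself, so it stays tied to the same datastore, and that a small preview can be gathered without pulling the whole table into memory.

```matlab
% Sketch: add a derived column to a tall table without gathering first.
ds = datastore('airlinesmall.csv', 'TreatAsMissing', 'NA');
ds.SelectedVariableNames = {'ArrDelay', 'DepDelay'};
t = tall(ds);

% Deferred computation: the new variable is derived from columns of t,
% so it is based on the same datastore and has the same tall size.
% (DelayGain is a hypothetical name used only for this illustration.)
t.DelayGain = t.ArrDelay - t.DepDelay;

% Evaluate only the first few rows instead of gathering the full table.
preview = gather(head(t, 5));
disp(preview)
```

If the column holds cell data, as in the question, SomeFunction would need to be written in terms of operations that tall arrays support (cellfun may be one of them, but check the list of supported functions for your release before relying on it).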
