Reducing a cell array of tables to a single table
I am using a one-dimensional cell array to save a set of tables.
The necessity for this arises from using a parfor loop in the main part of the program, where each i-th output is a table of results, and outputs must be generated in parallel. I would like to save everything in one table, but the order must be preserved. Since parfor restricts indexing, the best way I have found is to create said cell array and loop through it afterwards. Since each table corresponds to a single index, MATLAB happily accepts this indexing in the parallel loop.
Each iteration returns a table of length maxT with some number of columns that I determine dynamically. I then pre-allocate the table mainTable and loop over my cell array to fill it. To set the correct indices, I use a vector called asdf, which tells me which rows of mainTable belong to a given iteration i (there are other ways to do this, this just came out of trying to make parfor work). If that seems confusing, just think of me looping through the cell array and appending the table in cell i onto mainTable.
The issue is now that the second loop becomes rather slow, because it is not parallelized. Although the main work happens in the first parfor loop and therefore the current solution is still better than without parfor, I would very much like to make the reduction to a single table fast.
Even though I know the position of each table within mainTable (e.g. see variable "asdf"), I cannot index with such slices in a parfor loop. The code below, which does this without parfor, works.
Some things which do not work:
cell2table(resultCell) gives a table of tables. No join or union on this is successful.
resultCell{:} theoretically gives a list of all tables, but using [resultCell{:}] gives an error because of column duplication. Otherwise only the first table is extracted.
I did not find a way to parallelize the assignment to mainTable, because I always need to slice from a starting point to an ending point.
Any ideas?
parfor i=1:NrSims
%% Do something
% resultTable is a table of length maxT
resultCell{i}=resultTable;
end
%% Create main table
% Here I preallocate mainTable etc.
(...)
% Next, I create this index vector which allows me to slice mainTable for each i
asdf=kron(1:NrSims,ones(maxT,1)')';
for i=1:NrSims
slice=(asdf==i);
mainTable(slice, :) = resultCell{i};
end
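For readers who want to run the loop above, here is a minimal self-contained sketch. The sizes, the column names `sim` and `t`, and the preallocation via `repmat` are assumptions made only for illustration; the original preallocation step is elided in the question. A plain `for` stands in for `parfor` so the sketch runs without a parallel pool. Because every table has exactly maxT rows, the row range for iteration i can be computed directly instead of building a logical mask.

```matlab
% Hypothetical sizes and column names, chosen only for illustration.
NrSims = 3;
maxT   = 4;

% Stand-in for the parfor loop: one table of height maxT per iteration.
resultCell = cell(NrSims, 1);
for i = 1:NrSims
    resultCell{i} = table(repmat(i, maxT, 1), (1:maxT)', ...
        'VariableNames', {'sim', 't'});
end

% Assumed preallocation: repeat the first table NrSims times.
mainTable = repmat(resultCell{1}, NrSims, 1);

% Fill block i; its rows are (i-1)*maxT+1 through i*maxT.
for i = 1:NrSims
    rows = (i-1)*maxT + (1:maxT);
    mainTable(rows, :) = resultCell{i};
end
```

The direct row range avoids scanning a vector of length NrSims*maxT on every iteration, which is what the comparison `asdf==i` does.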
4 comments
Guillaume
2019-6-25
Edited: Guillaume
2019-6-25
I sincerely hope you're not actually naming your variable asdf! Giving variables a meaningful name (such as tableorder in this case) is the first step of documenting code.
As for your question, it seems you need to understand what cellarray{:} does, and thus the difference between [cellarray{:}] (aka horzcat(cellarray{:})) and vertcat(cellarray{:}). See Stephen's answer.
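The difference between the two concatenations can be seen on a small example. This is only a sketch with made-up tables (the names T1, T2 and the columns `a`, `b` are hypothetical): `cellarray{:}` expands to a comma-separated list, so `[cellarray{:}]` tries to place the tables side by side, while `vertcat(cellarray{:})` stacks them in cell order.

```matlab
% Two hypothetical tables with the same variable names.
T1 = table([1; 2], [10; 20], 'VariableNames', {'a', 'b'});
T2 = table([3; 4], [30; 40], 'VariableNames', {'a', 'b'});
resultCell = {T1, T2};

% Horizontal concatenation fails because both tables define 'a' and 'b'.
horzcatFailed = false;
try
    wide = [resultCell{:}];   % same as horzcat(T1, T2)
catch err
    horzcatFailed = true;
    disp(err.message)
end

% Vertical concatenation stacks the rows, preserving the cell order.
mainTable = vertcat(resultCell{:});
```

Since the cell index matches the loop index, the stacked table keeps the iteration order without any explicit row bookkeeping.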
Note that:
asdf=kron(1:NrSims,ones(maxT,1)')';
is more simply:
asdf = repelem(1:NrSims, maxT)' %which is a lot clearer as to the intent
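A quick check that the two expressions agree, using small values of NrSims and maxT picked only for illustration:

```matlab
% Hypothetical sizes for the comparison.
NrSims = 3;
maxT   = 4;

viaKron    = kron(1:NrSims, ones(maxT, 1)')';  % original expression
viaRepelem = repelem(1:NrSims, maxT)';         % suggested replacement

% Both are column vectors in which each index 1..NrSims appears maxT times in a row.
sameResult = isequal(viaKron, viaRepelem);
```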
Accepted Answer
Stephen23
2019-6-25
Edited: Stephen23
2022-2-23
2 comments
David Kelly
2020-8-18
Stephen,
I just wanted to say thanks for contributing to and solving so many of these questions on MATLAB Answers.
The amount of times you have saved me is unbelievable!
Cheers
David
More Answers (1)
Campion Loong
2019-6-27
Hi Ingo,
Glad you've found a solution. In case it may be useful in your workflow, I'd like to mention the various Datastores available to you in base MATLAB.
PARFOR support is built in via partition, so you don't need to explicitly manage the chunking and remerging. It also lets you scale out to other resources like clusters more easily.
Hope this helps.