How can I speed up this indexing code?
I have a cell array in which each cell contains a string. E.g....
a={'AAAAAA';'BBBBBB';'CCCCCC';'AAAAAA';'DDDDDD'};
Each cell in array a is associated with a row in a numerical array that contains 10 columns. E.g....
b=[0,0,0,1,1,0,0,0,0,0;
0,1,0,0,0,0,0,0,0,0;
1,0,1,0,1,0,2,0,0,0;
3,0,0,0,0,0,0,0,0,1;
0,0,0,0,0,0,2,0,0,1];
Some strings in a are repeated, such as 'AAAAAA' above. What I need to do is find all repeated cells in a and sum the associated rows of b into a single row. This should result in two new arrays (unia and bnew) that have the same number of rows, with every string in unia unique.
Easy enough to do with a loop such as:
unia=unique(a);
bnew=zeros(numel(unia),10);
for n=1:numel(unia)
pos=find(strcmp(a,unia{n}));
bnew(n,:)=sum(b(pos,:),1);
end
This works fine for small arrays but I have a case where a has 6 million cells and unia has 300,000 cells so I need something much faster. Any ideas?
Thanks!
Accepted Answer
Ive J
2021-10-28
Avoid comparing strings within the loop and instead take advantage of the index vector from unique:
% use strings instead of a cell array of character vectors; they're much more efficient to work with
a = ["A", "B2", "A", "C", "AA", "B2", "B2"];
b = randi([0 2], numel(a), 3)
% idx maps each element of a to its position in anew
[anew, ~, idx] = unique(a);
% sum the rows of b that belong to each unique string
bnew = arrayfun(@(x) sum(b(x == idx, :), 1), 1:numel(anew), 'UniformOutput', false);
bnew = vertcat(bnew{:})
anew
Also, you can use tall arrays when dealing with large arrays.
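The arrayfun call above still scans the whole index vector once per unique string, so with millions of rows and hundreds of thousands of groups it may remain slow. As a rough sketch (not part of the original answer, and not timed on data of that size), the same index vector can be passed to accumarray, or used to build a sparse group-indicator matrix, so that each row of b is visited only once. The variable names nGroups, nCols, S and bnew2 are introduced here purely for illustration; the data is the small example from the question.

```matlab
% Example data from the question
a = {'AAAAAA';'BBBBBB';'CCCCCC';'AAAAAA';'DDDDDD'};
b = [0,0,0,1,1,0,0,0,0,0;
     0,1,0,0,0,0,0,0,0,0;
     1,0,1,0,1,0,2,0,0,0;
     3,0,0,0,0,0,0,0,0,1;
     0,0,0,0,0,0,2,0,0,1];

% idx(k) is the row of unia that matches a{k}
[unia, ~, idx] = unique(a);
idx = idx(:);                 % make sure the group index is a column vector
nGroups = numel(unia);        % illustrative name
nCols = size(b, 2);           % illustrative name

% Option 1: accumarray, one column of b at a time
bnew = zeros(nGroups, nCols);
for c = 1:nCols
    bnew(:, c) = accumarray(idx, b(:, c), [nGroups 1]);
end

% Option 2: sparse indicator matrix, S(g, k) = 1 when row k belongs to group g,
% so S*b adds up the rows of b within each group
S = sparse(idx, (1:numel(idx))', 1, nGroups, numel(idx));
bnew2 = full(S * b);
```

Both options should give the same result as the loop in the question. The accumarray loop runs over the 10 columns rather than the 300,000 unique strings, and the sparse product avoids an explicit loop altogether. Which of the two is faster on a given machine is something to measure rather than assume.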