Matlab find unique column-combinations in matrix and respective index

Question

Benvaulter 2017-3-22

https://ww2.mathworks.cn/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index

Edited: Jan 2017-3-23

I have a large matrix with many rows and a limited number of columns (more than one), containing values between 0 and 9. I would like to find an efficient way to identify the unique row-wise combinations and their indices, in order to build sums afterwards (somewhat like a pivot logic). Here is an example of what I am trying to achieve:

a =

   1     2     3
   2     2     3
   3     2     1
   1     2     3
   3     2     1

uniqueCombs =

   1     2     3
   2     2     3
   3     2     1

numOccurrences =

 2
 1
 2

indices:

[1;4]
[2]
[3;5]

From matrix a, I want to first identify the unique combinations (row-wise), then count the number of occurrences and identify the row indices of each combination.

I have achieved this by generating strings with num2str and strcat, but this method appears to be very slow. Along these lines I have tried to find a way to form a new unique number by concatenating the values horizontally, but Matlab does not seem to support this (e.g. from [1;2;3] build 123). Sums won't work, because different combinations can have the same sum, so they cannot identify unique combinations. Any suggestions on how to best achieve this? Thanks!
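
The idea of building one number per row can be expressed without strings: because every entry is a single digit between 0 and 9, multiplying each row by powers of ten gives a unique key. The following is only a minimal sketch under that assumption; the names weights, keys and idx are illustrative and not from the original post.

```matlab
% Assumes every entry of A is an integer between 0 and 9.
A = [1 2 3; 2 2 3; 3 2 1; 1 2 3; 3 2 1];
nCols   = size(A, 2);
weights = (10 .^ (nCols-1:-1:0)).';   % column vector [100; 10; 1] for 3 columns
keys    = A * weights;                % one number per row: [123; 223; 321; 123; 321]
[uniqueKeys, ~, idx] = unique(keys);  % idx maps each row of A to its key
numOccurrences = accumarray(idx, 1);  % rows per key: [2; 1; 2]
```

With double values the keys stay exact for up to 15 columns, since 10^15 is below 2^53. For entries outside 0 to 9 this encoding is no longer unique.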

Answer 1

Guillaume 2017-3-22

https://ww2.mathworks.cn/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index#answer_259890

More or less the same as Jan's, using accumarray instead of splitapply (I'm still old school!):

A = [ 1     2     3
      2     2     3
      3     2     1
      1     2     3
      3     2     1];
[B, ~, ib] = unique(A, 'rows');      % B: unique rows, ib: row of B for each row of A
numoccurences = accumarray(ib, 1);   % number of rows of A per unique row
indices = accumarray(ib, find(ib), [], @(rows){rows});  % find(ib) simply generates (1:size(A,1))'
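
The question mentions building sums per combination afterwards. The same group index ib can feed accumarray directly for that. This is a sketch only; the vector vals is made-up data with one value per row of A and does not appear in the original thread.

```matlab
A = [1 2 3; 2 2 3; 3 2 1; 1 2 3; 3 2 1];
vals = [10; 20; 30; 40; 50];        % hypothetical values, one per row of A
[B, ~, ib] = unique(A, 'rows');     % ib: row of B for each row of A
groupSums = accumarray(ib, vals);   % sum of vals per unique row: [50; 20; 80]
```

Row k of groupSums belongs to row k of B, so [B, groupSums] gives a small pivot-style table.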

Guillaume 2017-3-23

Edited: Guillaume 2017-3-23

I suspect that accumarray will be faster as it is built-in compiled code whereas splitapply is m code, but I haven't conducted any test.

Note: for the indices,

indices = accumarray(ib, (1:numel(ib))', [], @(rows){rows});

is probably slightly faster, just not as concise.

Jan 2017-3-23

Edited: Jan 2017-3-23

@Guillaume: I compare this with cellfun. Older versions of Matlab shipped the C sources for this Mex function. There, calling a function handle is very expensive, because the call has to go back to the Matlab level. Therefore the built-in methods selected by a string are much faster: 'length', 'isclass', etc.

Then using a compiled Mex function is not a real benefit, because mexCallMATLAB has some overhead. This might concern accumarray also. I guess that your accumarray approach is faster than the loop, but I know that it looks very cryptic ;-)
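
For readers unfamiliar with the two cellfun calling styles compared here, a small sketch follows. The cell array C is made up for illustration, and the sketch makes no timing claim of its own.

```matlab
C = {1:3, 'abcd', []};                          % hypothetical sample data
lenByName   = cellfun('length', C);             % legacy syntax: built-in method selected by name
lenByHandle = cellfun(@length, C);              % function-handle syntax, same result
isDouble    = cellfun('isclass', C, 'double');  % 'isclass' takes the class name as extra argument
```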

But now I can leave the speculations and run a test: With

A = randi([1, 100], 1e5, 3); % Test data

my loop takes 14.75 seconds, your accumarray approach takes 0.44 seconds. The results differ in the order of the indices. So perhaps this is wanted:

[B, iB, iA] = unique(A, 'rows');
indices     = accumarray(iA, (1:numel(iA)).', [], @(r){sort(r)});

The result is clear: @Benvaulter, please unaccept my answer and select Guillaume's, and of course use it also to save time and energy.
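
A comparison of this kind could be set up roughly as follows. This is a sketch and not the code that produced the timings quoted above: it assumes a smaller matrix (1e4 rows) so the loop finishes quickly, and the variable names are illustrative.

```matlab
% Smaller test data than in the comment, so the loop stays fast (assumption).
A = randi([1, 100], 1e4, 3);
[~, ~, iA] = unique(A, 'rows');

tic;                                   % accumarray approach
indicesAcc = accumarray(iA, (1:numel(iA)).', [], @(r){sort(r)});
tAcc = toc;

tic;                                   % loop approach
n = max(iA);
indicesLoop = cell(n, 1);
for k = 1:n
  indicesLoop{k} = find(iA == k);
end
tLoop = toc;

fprintf('accumarray: %.3f s, loop: %.3f s\n', tAcc, tLoop);
assert(isequal(indicesAcc, indicesLoop));   % both give the same index lists
```

The sort inside the anonymous function matters for the comparison, because accumarray does not promise a particular order of the values it passes to the function.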

Answer 2

Jan 2017-3-22

https://ww2.mathworks.cn/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index#answer_259879

Edited: Jan 2017-3-23

A = [ 1     2     3; ...
      2     2     3; ...
      3     2     1; ...
      1     2     3; ...
      3     2     1];
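[B, iB, iA] = unique(A, 'rows');   % iA: group number of each row of A
G = unique(iA);                    % list of group numbers, 1:size(B,1)
numOccurrences = splitapply(@numel, iA, iA);  % count the rows of A in each group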

I cannot test a method to obtain the indices list as wanted. I assume this works with splitapply also. A simple loop approach at least:

n = length(G);
indices = cell(1, n);
for k = 1:n
  indices{k} = find(iA == G(k));
end
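
The assumption that splitapply can produce the index lists as well can be checked with a short sketch. It relies on the documented behaviour that a function returning a 1-by-1 cell lets splitapply collect groups of different sizes; the name rowNumbers is illustrative.

```matlab
A = [1 2 3; 2 2 3; 3 2 1; 1 2 3; 3 2 1];
[B, ~, iA] = unique(A, 'rows');
rowNumbers = (1:numel(iA)).';
% Wrap each group's row numbers in a cell so groups of different size can be returned.
indices = splitapply(@(r){r}, rowNumbers, iA);
% indices is a 3x1 cell array: {[1;4]; 2; [3;5]}
```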

[EDITED] Code is tested now. Use the much faster solution of Guillaume for production work.

Benvaulter 2017-3-23

Perfect solution to my problem - thanks a lot!
