How do I identify unique rows based on multiple columns and calculate the average of the rest of the columns?

Question

Leon 2021-10-19

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/1567308-how-do-i-identify-unique-rows-based-on-multiple-columns-and-calculate-the-average-of-the-rest-of-the

评论： Leon 2021-10-19

I have a matrix as below:

A = [1 4 3 8; 4 5 6 9; 1 6 3 6; 2 6 9 3; 1 5 3 7];

My goal is to identify rows with both identical Column 1 and identical Column 3 values, and then calculate the average for the rest of the columns, i.e., Column 2 and Column 4, within these duplicate rows. In this example, the duplicate rows would be Rows # 1, 3, and 5. My ending matrix would be:

B = [1 5 3 7; 4 5 6 9; 2 6 9 3];

This is a much simplified example. In reality, I have 35 columns that need to be averaged, and millions of rows. What is the most efficient way of handling this? Do I have to write a loop and process each of the unique rows individually?

Many thanks!

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

请先登录，再回答此问题。

Answer 1

Kevin Holly 2021-10-19

1
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/1567308-how-do-i-identify-unique-rows-based-on-multiple-columns-and-calculate-the-average-of-the-rest-of-the#answer_812193

编辑：Kevin Holly 2021-10-19

在 MATLAB Online 中打开

A = [1 4 3 8; 4 5 6 9; 1 6 3 6; 2 6 9 3; 1 5 3 7]
A = 5×4
   4     3     8
   5     6     9
   6     3     6
   6     9     3
   5     3     7

I am going to assume that any row that columns 1 and 3 are identical, irregardless of what pair, you want to ignore those rows in the averaging of other columns.

Here is my approach:

t = table(A(:,1),A(:,3))
t = 5×2 table
    Var1    Var2
    ____    ____

     1       3  
     4       6  
     1       3  
     2       9  
     1       3  
[C, ia, ic] = unique(t,'rows')
C = 3×2 table
    Var1    Var2
    ____    ____

     1       3  
     2       9  
     4       6  
ia = 3×1
     1
     4
     2
ic = 5×1
     1
     3
     1
     2
     1
ic==1
ans = 5×1 logical array
   1
   0
   1
   0
   1
A(ic==1,:)
ans = 3×4
     1     4     3     8
     1     6     3     6
     1     5     3     7
mean(A(ic==1,:))
ans = 1×4
     1     5     3     7
B = [mean(A(ic==1,:));A(ic~=1,:)]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3

Your answer:

B = [1 5 3 7; 4 5 6 9; 2 6 9 3]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3

Without showing work:

t = table(A(:,1),A(:,3));
[~,~,ic] = unique(t,'rows');
B = [mean(A(ic==1,:));A(ic~=1,:)]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

Leon 2021-10-19

Many thanks for the solution!

请先登录，再进行评论。

How do I identify unique rows based on multiple columns and calculate the average of the rest of the columns?

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

采纳的回答

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

更多回答（0 个）

另请参阅

类别

标签

产品

版本

Community Treasure Hunt

How do I identify unique rows based on multiple columns and calculate the average of the rest of the columns?

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

采纳的回答

1 个评论 显示 -1更早的评论隐藏 -1更早的评论

更多回答（0 个）

另请参阅

类别

标签

产品

版本

Community Treasure Hunt

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

1 个评论
显示 -1更早的评论隐藏 -1更早的评论