Find average based on multiple columns of data structure

2 次查看(过去 30 天)
I have data in the following format:
A B C
1 1 1 a
2 2 1 0
3 3 1 a
4 4 1 a
5 6 1 0
6 1 1 a
7 2 1 a
8 4 1 a
9 6 1 0
10 1 2 b
11 2 2 b
12 3 2 0
. . . .
. . . .
. . . .
99 1 20 c
100 2 20 0
101 6 20 c
I want to find the average of the values in Column-C for the unique values in Column-B for the unique values in Column-A. For example, average the non-zero values in Column-C (say, "a"), where the value of Column-B is "1" AND the value of Column-A is "1"; then do the same for Column-C values of "b" with Column-B values of "2" AND Column-A values of "1"; and so-forth and so-on through all the entries.
I have tried looping over the values in Column-A along with a loop over the values in Column-B, with no success. I have thought about "unique" or "foreach", but I have little experience with those functions, and I have not had time to attempt an implementation.
Thoughts?
  3 个评论
Image Analyst
Image Analyst 2014-8-8
Please make it easy for us to help you, not hard . Many/most of us are waiting for you to upload code to generate such a matrix, table, or cell array before we try anything.
Paul Huter
Paul Huter 2014-8-9
The data is coming from an Excel file, which I am reading in with 'xlsread':
[data raw text] = xlsread(FILE)
I am then trying to work with 'data', and tried:
for i = 1:length(data(:,1))
for j = 1:length(data(:,2))
...
end
end
But that did not work. I thought about unique or foreach(unique), but I am not comfortable enough with their syntax and the HELP in Matlab really did not help all that much. I know I have used code, before, which used those functions, so it may be what I need to do, but assistance with coding it up would be appreciated (just getting me started would be great).
I realized that I had an error in my original post. Looking at it again, I would like to average the following (row-column):
1-C and 6-C and other Column-A = 1 and Column-B = 1
2-C and 7-C and other Column-A = 2 and Column-B = 1
3-C and other Column-A = 3 and Column-B = 1
4-C and 8-C and other Column-A = 4 and Column-B = 1
5-C and 9-C and other Column-A = 6 and Column-B = 1
10-C and other Column-A = 1 and Column-B = 2
11-C and other Column-A = 2 and Column-B = 2
12-C and other Column-A = 3 and column-B = 2
...etc
Any pointers, code snippets, or full functions are appreciated.
Thank you.

请先登录,再进行评论。

回答(1 个)

Nir Rattner
Nir Rattner 2014-8-8
Assuming your data is stored in a matrix, you can use the "unique" and "accumarray" functions. First, it seems that you want to remove all rows where the the third column is equal to zero. Next, you can pull the unique row entries considering only the first two columns to get your subscript for the "accumarray" function. Finally, you can use the "accumarray" function to take the average grouped by your unique subscripts:
M(M(:, 3) == 0, :) = [];
[i,~,j] = unique(M(:, [1,2]), 'rows');
[i, accumarray(j, M(:, 3), [], @mean)]

类别

Help CenterFile Exchange 中查找有关 Large Files and Big Data 的更多信息

标签

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by