For Loop or function for repeating action

I have a 225 x 2 matrix A. One column is a variable that always ranges from 1 to 5 (like grades) and the second column is also numeric. I now want to calculate the mean, median, and first and third quartiles of the second column for each grade.
The result needs to be interpretable like: mean(age) of A students is better than mean(age) of B students.
Grades 1 2 3 [etc]
Mean
Median
1st Qntl
3rd Qntl
I did it all manually, which is a lot of work, because I have 8 hypotheses for which the calculations are almost the same (the matrix A is in reality 225 x 11, but I only need 2-3 columns per hypothesis). Now I wonder whether there is a faster and more efficient way to do it, namely in a for loop,
where I can write something like:
for i = 1:5
if ERM == i
mean_Hyp_1 = nanmean(A(ERM==1;:,2))
meadian_Hyp_1 = nanmedian(A(ERM==i;:,2)
etc
end
end
Thanks in advance

Accepted Answer

You had the right idea. The "find" function can be used inside a loop to find all the rows where ERM == 1, 2, ..., and the statistics can then be calculated on those rows.
Let me show this with an example:
a = [1;3;2;4;5;1;2;4;3;5;3;2;1]
b = [10;15;24;54;36;57;87;98;65;78;05;48;65]
input = [a b]
mean = []
median = []
for i = 1:5
mean(i) = nanmean(input(find(input(:,1)==i), 2))
median(i) = nanmedian(input(find(input(:,1)==i), 2))
end
In the case above, we use the "find" function on the first column of input to extract the indices of all rows where input(:,1) == i, and then take the mean (and median) of the corresponding values in the second column.

9 Comments

Vishwas,
I have a question about your answer. When I put your code in an m-file in my R2017a, the find parts have a red underline, telling me that
If 'input' is an indexed variable, performance can be increased using logical indexing instead of FIND.
If I click fix, the word find is removed (matching the answer I have given above).
Would one be better than the other in this example (and in general)?
It is better to avoid using "input" as a variable name, due to conflict with the frequently-used input() function.
Omitting the find() is more efficient.
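To illustrate the two points above, here is a minimal sketch that compares find-based indexing with a plain logical mask on the example data from the answer. The variable names grades, values, meanFind and meanMask are hypothetical, chosen so that nothing shadows input(), mean() or median(). It also swaps nanmean for mean(..., 'omitnan'), which assumes R2015a or later.

```matlab
% Example data from the answer, under hypothetical names that do not
% shadow built-in functions.
grades = [1;3;2;4;5;1;2;4;3;5;3;2;1];
values = [10;15;24;54;36;57;87;98;65;78;5;48;65];

meanFind = zeros(1,5);   % preallocate the results
meanMask = zeros(1,5);
for i = 1:5
    rows = find(grades == i);   % numeric row indices
    mask = (grades == i);       % logical mask, no find() needed
    meanFind(i) = mean(values(rows), 'omitnan');
    meanMask(i) = mean(values(mask), 'omitnan');
end
disp(isequal(meanFind, meanMask))   % both approaches select the same rows
```

Both versions pick the same rows; the logical mask simply skips the extra conversion to numeric indices.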
Hello Tim and Vishwas,
thank you both for answering.
I have now tried both of your code versions and I get an error, because the number of values in B is not the same for each value of A (1, 2, etc.):
Subscripted assignment dimension mismatch.
Error in HypotheseEins (line 54)
mean_A(i) = nanmean(A(A(:,1)==i,2:end));
AND
Subscripted assignment dimension mismatch.
Error in HypotheseEins (line 49)
mean(i) = nanmean(A(find(A(:,1)==i), 2:end))
Please don't mind the slight differences from the example. As I said above, my matrix is actually bigger than the one I used to ask my question. Do I first have to somehow pad the smaller vectors with zeros up to the length of the longest vector?
Thank you very much.
mean_A(i) = nanmean(A(A(:,1)==i,2:end));
takes the mean over multiple columns (2:end), as you seem to want, so it returns multiple means (one for each column).
But you are still trying to store those multiple values in mean_A(i), which is a single element of the array mean_A.
Try
mean_A(i,:) = nanmean(A(A(:,1)==i,2:end));
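As a minimal sketch of that row-wise assignment, assume a small hypothetical matrix A with grades in column 1 and two value columns. The data, nGrades, and the use of mean(..., 1, 'omitnan') in place of nanmean are assumptions for illustration. The averaging dimension is given explicitly so that a grade with only one matching row still yields one mean per column.

```matlab
% Hypothetical data: column 1 holds grades 1-3, columns 2-3 hold values.
A = [1 10  100;
     2 20  200;
     1 30  300;
     3 40  400;
     2 NaN 600];
nGrades = 3;
mean_A = zeros(nGrades, size(A,2) - 1);   % one row per grade, one column per variable
for i = 1:nGrades
    % Average down the rows (dimension 1), ignoring NaN values.
    mean_A(i,:) = mean(A(A(:,1)==i, 2:end), 1, 'omitnan');
end
disp(mean_A)
```

Preallocating mean_A makes the expected shape explicit, so no padding with zeros is needed when the grades have different numbers of rows.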
Thank you very much! It works!
I changed it to display the grades as columns.
mean_A(:,i) = nanmean(A(A(:,1)==i,2:end));
My code looks much tidier now and I can even put several hypotheses together in one script.
I used display(mean_A) for the results to show in a "table" form. Do you by any chance know how I can name the rows and columns of the result?
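One possible way to label such a result is to convert it to a table with array2table. This is only a sketch: the numbers and the names Grade1 to Grade5, Age and Income are hypothetical placeholders, and it assumes a release that has tables (R2013b or later).

```matlab
% Hypothetical result matrix: rows are variables, columns are grades 1-5.
mean_A = [20  25  30  35  40;
          1.5 2.5 3.5 4.5 5.5];
T = array2table(mean_A, ...
    'VariableNames', {'Grade1','Grade2','Grade3','Grade4','Grade5'}, ...
    'RowNames', {'Age','Income'});
disp(T)   % prints the matrix with row and column labels
```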
@Vishwas Vijaya Kumar: is there a good reason for shadowing the inbuilt input function?
And mean().
And median().
And that's some pretty tortured indexing.

More Answers (1)

You can use the condition A(:,1) == i as a logical index that selects which values in A(:,2) to consider, i.e.
A = [1 2 3 1 1 2 3; 4 5 6 7 8 9 0]'
for i = 1:3
mean_A(i) = nanmean(A(A(:,1)==i,2));
% etc..
end
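The "etc." could be filled in along the following lines. This is a sketch rather than a definitive implementation: the stats layout (statistics as rows, grades as columns) is an assumption based on the table in the question, mean and median with 'omitnan' stand in for nanmean and nanmedian, and quantile requires the Statistics and Machine Learning Toolbox.

```matlab
A = [1 2 3 1 1 2 3; 4 5 6 7 8 9 0]';
nGrades = 3;
stats = zeros(4, nGrades);   % rows: mean, median, 1st quartile, 3rd quartile
for i = 1:nGrades
    x = A(A(:,1)==i, 2);                      % values belonging to grade i
    stats(1,i) = mean(x, 'omitnan');
    stats(2,i) = median(x, 'omitnan');
    stats(3:4,i) = quantile(x, [0.25 0.75]);  % 1st and 3rd quartile
end
disp(stats)
```

Each column of stats then holds the four statistics for one grade, so the grades can be compared side by side.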

1 Comment

Hi Tim,
Thanks to you, the code for all my hypotheses is running so much faster. Now I am on my last hypothesis, which uses the same method as before with one constraint. Before I open a new question, I just wanted to see whether you can help.
Matrix A has 5 columns: the first column holds grades (1-5) and the second column holds years ranging from 2008-2013. The rest of the columns are again numeric.
First: "Cluster" the years 2008-2010, 2011-2013, 2014-2016
Second: Search Grades between the years 2008-2010, 2011-2013, 2014-2016
Third: Calculate the means of every column according to grade and clustered year.
The main problem I have encountered is that Matlab doesn't let me write the expression
for i = 2008:2010 ...etc
I did it manually again (mean of each year for all variables), but I cannot fit it into a loop the way your previous code showed me:
for i= 1:5
...(A(:,i)==i)..etc
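One way to approach this, sketched under assumptions: loop over the cluster number rather than over the year itself, and combine a year-range mask with the grade mask. The data, the clusters matrix and the name means are hypothetical. Clusters or grades with no matching rows are left as NaN.

```matlab
% Hypothetical data: col 1 = grade, col 2 = year, cols 3-4 = values.
A = [1 2008 10  1;
     1 2009 20  3;
     2 2010 30  5;
     1 2012 40  7;
     2 2013 50  9;
     2 2011 70 11];
clusters = [2008 2010; 2011 2013; 2014 2016];   % [first last] year per cluster
nGrades = 2;
nVars = size(A,2) - 2;
means = nan(nGrades, nVars, size(clusters,1));  % grade x variable x cluster
for c = 1:size(clusters,1)
    % Rows whose year falls inside cluster c.
    inCluster = A(:,2) >= clusters(c,1) & A(:,2) <= clusters(c,2);
    for g = 1:nGrades
        rows = inCluster & A(:,1) == g;         % grade g within cluster c
        if any(rows)
            means(g,:,c) = mean(A(rows, 3:end), 1, 'omitnan');
        end
    end
end
disp(means)
```

Here means(g,:,c) holds the column means for grade g in year cluster c; the loop variables index the clusters and grades, so the year values themselves are never used as array indices.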
