if loop within for loop for statistical analysis of data
Hi,
I have code that operates on a very large column vector of the form:
P_b=[2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
For that vector, I would like to group all consecutive non-NaN values, i.e. [2;3;4;5;6], [3;4;5;6], etc., fit a normal distribution to each group, extract the mean, and collect the results in a vector. This vector would then contain the means of all the 'grouped' data of P_b.
It may sound complicated, but it shouldn't be. I have written the code below; however, an odd problem arises: MATLAB does not recognise the variable 'avg' when, at the end of the for-loop, I try to save the loop results in a vector. When I run the code without that last line, it seems to recognise 'avg'. Any ideas? Thanks in advance for your help. The code is below.
P_pdf=[];
%Indices with NaN
idxnan=find(isnan(P_b));
for i=1:size(idxnan,1)-1
%Indices of numeric values
idxlow=idxnan(i)+1;
idxup=idxnan(i+1)-1;
%Group P_b Matrices according to NaN values
P_mat=P_b(idxlow:idxup);
%Reject empty matrices and treat singular values
if size(P_mat)==[1,1];
avg=P_mat;
elseif size(P_mat)==[0,0];
avg=NaN;
%Create distribution fit
pdf=fitdist(P_mat,'Normal');
avg=pdf.mu;
end
P_pdf=[P_pdf;avg];
end
Accepted Answer
Stephen23
2017-1-21
Edited: Stephen23
2017-1-21
This is a classic example of how badly formatted code makes buggy code. When the code is formatted using MATLAB's default formatting rules (select all, ctrl+i), then the cause is much easier to spot:
P_pdf = [];
%Indices with NaN
idxnan = find(isnan(P_b));
for i = 1:size(idxnan, 1) - 1
    %Indices of numeric values
    idxlow = idxnan(i) + 1;
    idxup = idxnan(i + 1) - 1;
    %Group P_b Matrices according to NaN values
    P_mat = P_b(idxlow:idxup);
    %
    %Reject empty matrices and treat singular values
    if size(P_mat) == [1, 1];
        avg = P_mat;
    elseif size(P_mat) == [0, 0];
        avg = NaN;
        %Create distribution fit
        pdf = fitdist(P_mat, 'Normal');
        avg = pdf.mu;
    end
    P_pdf = [P_pdf; avg];
end
Now it is clear that there is an if and an elseif but no else, so if neither of these conditions has been fulfilled then avg never gets defined. The fit itself sits inside the elseif branch, so it only runs when that test is true. The error is due to testing the matrix size like this:
size(P_mat) == [0, 0]
which is never going to be true when P_mat is created by indexing like this:
P_mat = P_b(idxlow:idxup);
Try it yourself at home:
>> V = 1:3;
>> size(V(2:1))
ans =
1 0
So that test ==[0, 0] will always fail. The logic is bad anyway: surely you want to test for non-empty vectors and apply the fit to them?
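A minimal sketch of the point above, using a small made-up vector (the values are assumptions chosen for illustration): an empty selection made by indexing a vector is empty without being 0-by-0, so isempty is the more reliable test.

```matlab
% Hypothetical data, chosen so that two NaNs are adjacent.
P_b = [2; 3; NaN; NaN; 4];

% Selecting the (non-existent) values between the two NaNs gives an empty array.
P_mat = P_b(4:3);

% Comparing the size against [0, 0] fails, because one dimension is not zero.
isZeroByZero = isequal(size(P_mat), [0, 0]);

% isempty is true whenever any dimension is zero.
emptyFlag = isempty(P_mat);
```

Note also that if requires every element of a non-scalar condition to be nonzero, so a comparison such as [1 0] == [0 0] is treated as false.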
Here is a slightly more robust version of your loop:
P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idn = isnan(P_b);
idd = diff(idn);
idb = find([~idn(1);idd<0])
ide = find([idd>0;~idn(end)])
out = NaN(size(idb));
for k = 1:numel(idb)
    tmp = P_b(idb(k):ide(k));
    pdf = fitdist(tmp,'Normal'); % untested, I don't have fitdist
    out(k) = pdf.mu; % untested
end
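To see what the index vectors in the loop above contain, here is a small sketch for the same data. It uses mean in place of fitdist so that it runs without the toolbox; that substitution is an assumption, based on the fact that the maximum-likelihood estimate of mu for a normal distribution is the sample mean.

```matlab
P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idn = isnan(P_b);
idd = diff(idn);                  % +1 where a NaN follows a number, -1 where a number follows a NaN
idb = find([~idn(1); idd<0]);     % first index of each run of numbers
ide = find([idd>0; ~idn(end)]);   % last index of each run of numbers
out = NaN(size(idb));
for k = 1:numel(idb)
    out(k) = mean(P_b(idb(k):ide(k)));   % stand-in for the fitted mu
end
% idb is [1;7;12;16], ide is [5;10;14;16], out is [4;4.5;3;3]
```

Each pair idb(k), ide(k) marks the first and last element of one run of non-NaN values, so the loop never needs to look at the NaN positions directly.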
Personally I would not write all of that code: I would simply split the input vector using accumarray, and then use cellfun to do whatever processing:
P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idx = isnan(P_b);
idy = cumsum([1;diff(idx)>0]);
C = accumarray(idy(~idx),P_b(~idx),[],@(n){n});
D = cellfun(@(v)fitdist(v,'Normal'),C); % untested: I don't have fitdist
P_pdf = arrayfun(@(s)s.mu,D) % untested
It might be required to get cellfun to return a cell array:
D = cellfun(@(v)fitdist(v,'Normal'),C,'Uni',0); % untested
P_pdf = cellfun(@(s)s.mu,D) % untested
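The grouping step can be checked on its own, without fitdist. The sketch below reuses the accumarray split from above and, as an assumption, substitutes mean for the fitted mu (for a normal distribution the maximum-likelihood estimate of mu is the sample mean); groupSizes and groupMeans are names introduced here for illustration.

```matlab
P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idx = isnan(P_b);
idy = cumsum([1; diff(idx)>0]);                      % group number for every element
C = accumarray(idy(~idx), P_b(~idx), [], @(n){n});   % one cell per group
groupSizes = cellfun(@numel, C);                     % how many values each group holds
groupMeans = cellfun(@mean, C);                      % stand-in for the fitted mu
% groupSizes is [5;4;3;1], groupMeans is [4;4.5;3;3]
```

Inspecting groupSizes first is a cheap way to spot single-value groups before passing the cells to a fitting function.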
2 Comments
Stephen23
2017-1-21
Edited: Stephen23
2017-1-21
Check how large the selection is like this:
P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idn = isnan(P_b);
idd = diff(idn);
idb = find([~idn(1);idd<0])
ide = find([idd>0;~idn(end)])
out = NaN(size(idb));
for k = 1:numel(idb)
    tmp = P_b(idb(k):ide(k));
    if isempty(tmp)
        out(k) = NaN;
    elseif isscalar(tmp)
        out(k) = tmp;
    else
        pdf = fitdist(tmp,'Normal');
        out(k) = pdf.mu;
    end
end
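As a rough check of how this indexing behaves with leading, consecutive, and trailing NaNs, here is a sketch on a made-up vector. The input values are hypothetical, and mean again stands in for the fitdist call as an assumption so that the example runs without the toolbox.

```matlab
% Hypothetical input with leading, consecutive, and trailing NaNs.
P_b = [NaN; NaN; 1; 2; NaN; NaN; 7; NaN];
idn = isnan(P_b);
idd = diff(idn);
idb = find([~idn(1); idd<0]);     % first index of each run of numbers
ide = find([idd>0; ~idn(end)]);   % last index of each run of numbers
out = NaN(size(idb));
for k = 1:numel(idb)
    tmp = P_b(idb(k):ide(k));
    if isscalar(tmp)
        out(k) = tmp;             % a single value is its own mean
    else
        out(k) = mean(tmp);       % stand-in for the mu of fitdist(tmp,'Normal')
    end
end
% idb is [3;7], ide is [4;7], out is [1.5;7]
```

Runs of adjacent NaNs produce no start or end index at all, so no extra group appears for them.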
More Answers (1)
Kosta
2017-1-21
1 Comment
Stephen23
2017-1-21
Edited: Stephen23
2017-1-21
Note that this code is not robust (e.g. it cannot cope with consecutive NaNs), nor is it efficient, due to the concatenation inside the loop. In particular, this is very poor code:
size(P_mat)==size(zeros(0,1))
Hard to read, hard to comprehend, and pointlessly complicated. See my answer and comments for much simpler code.