if statement within a for loop for statistical analysis of data

Hi,
I have code that processes data stored in a very large column vector of the form:
P_b=[2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
For that vector, I would like to group all consecutive non-NaN values, i.e. [2;3;4;5;6], [3;4;5;6] etc., fit a normal distribution to each group, extract the mean, and collect the results in a vector. This vector should contain the means of all the 'grouped' data of P_b.
This may sound complicated, but it shouldn't be. I have written the code below; however, an odd problem arises: MATLAB does not recognise the variable 'avg' when, at the end of the for-loop, I try to save the loop results in a vector. When I run the code without that last line, it does seem to recognise 'avg'. Any ideas? Thanks in advance for your help. Below is the code.
P_pdf=[];
%Indices with NaN
idxnan=find(isnan(P_b));
for i=1:size(idxnan,1)-1
%Indices of numeric values
idxlow=idxnan(i)+1;
idxup=idxnan(i+1)-1;
%Group P_b Matrices according to NaN values
P_mat=P_b(idxlow:idxup);
%Reject empty matrices and treat singular values
if size(P_mat)==[1,1];
avg=P_mat;
elseif size(P_mat)==[0,0];
avg=NaN;
%Create distribution fit
pdf=fitdist(P_mat,'Normal');
avg=pdf.mu;
end
P_pdf=[P_pdf;avg];
end

Accepted Answer

Stephen23 2017-1-21
Edited: Stephen23 2017-1-21
This is a classic example of how badly formatted code makes buggy code. When the code is formatted using MATLAB's default formatting rules (select all, ctrl+i), then the cause is much easier to spot:
P_pdf = [];
%Indices with NaN
idxnan = find(isnan(P_b));
for i = 1:size(idxnan, 1) - 1
    %Indices of numeric values
    idxlow = idxnan(i) + 1;
    idxup = idxnan(i + 1) - 1;
    %Group P_b Matrices according to NaN values
    P_mat = P_b(idxlow:idxup);
    %
    %Reject empty matrices and treat singular values
    if size(P_mat) == [1, 1];
        avg = P_mat;
    elseif size(P_mat) == [0, 0];
        avg = NaN;
        %Create distribution fit
        pdf = fitdist(P_mat, 'Normal');
        avg = pdf.mu;
    end
    P_pdf = [P_pdf; avg];
end
Now it is clear that there is an if and an elseif, but no else: if neither condition is fulfilled, then avg never gets defined. The error is due to testing the matrix size like this:
size(P_mat) == [0, 0]
which is never going to be true when P_mat is created by indexing like this:
P_mat = P_b(idxlow:idxup);
Try it yourself at home:
>> V = 1:3;
>> size(V(2:1))
ans =
1 0
So that test ==[0, 0] will always fail. The logic is bad anyway: surely you want to test for non-empty vectors and apply the fit to them?
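A minimal sketch of the point above, using only base MATLAB functions. The variable names (E, isEmptyBySize and so on) are made up for illustration. It relies on the fact that an if with a non-scalar condition only runs when every element of the condition is true, so the element-wise size comparison cannot succeed for an empty selection from a vector, whereas isempty and isscalar state the intent directly:

```matlab
% Illustrative names only; P_b in the question is a column vector like V.
V = (1:3).';            % small column vector
E = V(2:1);             % empty selection, like P_b(idxlow:idxup) between adjacent NaNs
S = V(2:2);             % selection holding exactly one value

% size(E) has one zero and one nonzero dimension, so comparing it with
% [0, 0] gives a mixed true/false result, which an if treats as false.
isEmptyBySize = all(size(E) == [0, 0]);

% These express the same tests without depending on the orientation.
isEmptyByFunc = isempty(E);
isOneByFunc   = isscalar(S);
```

Here isEmptyBySize is false while isEmptyByFunc and isOneByFunc are both true.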
Here is a slightly more robust version of your loop:
P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idn = isnan(P_b);
idd = diff(idn);
idb = find([~idn(1);idd<0])
ide = find([idd>0;~idn(end)])
out = NaN(size(idb));
for k = 1:numel(idb)
    tmp = P_b(idb(k):ide(k));
    pdf = fitdist(tmp,'Normal'); % untested, I don't have fitdist
    out(k) = pdf.mu; % untested
end
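To see what the index vectors in the loop above contain, here is a rough sketch that runs without the Statistics Toolbox. It swaps fitdist for mean, on the assumption that the mu of a fitted normal distribution is just the sample mean of the data, so treat mean as a stand-in rather than the same computation:

```matlab
P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idn = isnan(P_b);                 % true where P_b is NaN
idd = diff(idn);                  % -1 at NaN -> number, +1 at number -> NaN
idb = find([~idn(1); idd<0]);     % first index of each run of numbers
ide = find([idd>0; ~idn(end)]);   % last index of each run of numbers
out = NaN(size(idb));             % preallocate, one mean per run
for k = 1:numel(idb)
    % mean used as a stand-in for fitdist(...).mu (assumption, see above)
    out(k) = mean(P_b(idb(k):ide(k)));
end
```

For this input the runs start at indices 1, 7, 12, 16 and end at 5, 10, 14, 16, and out comes to [4; 4.5; 3; 3].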
Personally I would not write all of that code: I would simply split the input vector using accumarray, and then use cellfun to do whatever processing:
P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idx = isnan(P_b);
idy = cumsum([1;diff(idx)>0]);
C = accumarray(idy(~idx),P_b(~idx),[],@(n){n});
D = cellfun(@(v)fitdist(v,'Normal'),C); % untested: I don't have fitdist
P_pdf = arrayfun(@(s)s.mu,D) % untested
It might be required to get cellfun to return a cell array:
D = cellfun(@(v)fitdist(v,'Normal'),C,'Uni',0); % untested
P_pdf = cellfun(@(s)s.mu,D) % untested
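The grouping step in the accumarray approach can be checked on its own, without fitdist. The sketch below is only an illustration: it again uses mean as a stand-in for the fitted mu (assuming the normal fit's mu equals the sample mean), and the names grp and means are made up here:

```matlab
P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idx = isnan(P_b);
% The group number increases by one each time a NaN follows a number.
idy = cumsum([1; diff(idx)>0]);
grp = idy(~idx);                               % group number of each non-NaN value
C = accumarray(grp, P_b(~idx), [], @(n){n});   % one cell per group
% mean is a stand-in for fitdist(v,'Normal').mu (assumption, see above)
means = cellfun(@mean, C);
```

With this input grp is 1 for the first five values, 2 for the next four, 3 for the next three and 4 for the last one, so C has four cells and means comes to [4; 4.5; 3; 3].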
2 Comments
Kosta 2017-1-21
Thanks, this stupid mistake I made does indeed solve part of the problem. However, I still can't get this to work. P_mat does not seem to be handled by the if statement every time for some reason, resulting in a blank P_mat.
Stephen23 2017-1-21
Edited: Stephen23 2017-1-21
Check how large the selection is like this:
P_b = [2;3;4;5;6;NaN;3;4;5;6;NaN;3;4;2;NaN;3;NaN];
idn = isnan(P_b);
idd = diff(idn);
idb = find([~idn(1);idd<0])
ide = find([idd>0;~idn(end)])
out = NaN(size(idb));
for k = 1:numel(idb)
    tmp = P_b(idb(k):ide(k));
    if isempty(tmp)
        out(k) = NaN;
    elseif isscalar(tmp)
        out(k) = tmp;
    else
        pdf = fitdist(tmp,'Normal');
        out(k) = pdf.mu;
    end
end


More Answers (1)

Kosta 2017-1-21
Got this whole thing working like this finally. Thanks again for your help:
P_pdf=[];
%Indices with NaN
idxnan=find(isnan(P_b));
for i=1:size(idxnan,1)-1
%Indices of numeric values
idxlow=idxnan(i)+1;
idxup=idxnan(i+1)-1;
%Group Power Matrices according to NaN values
P_mat=P_b(idxlow:idxup);
%Reject empty matrices and treat singular values
if size(P_mat)==[1,1];
avg=P_mat;
elseif size(P_mat)==size(zeros(0,1));
avg=NaN;
else
%Create distribution fit
pdf=fitdist(P_mat,'Normal');
avg=pdf.mu;
end
P_pdf=[P_pdf;avg];
end
1 Comment
Stephen23 2017-1-21
Edited: Stephen23 2017-1-21
Note that this code is not robust (e.g. it cannot cope with consecutive NaNs), nor is it efficient, because of the concatenation inside the loop. In particular, this is very poor code:
size(P_mat)==size(zeros(0,1))
Hard to read, hard to comprehend, and pointlessly complicated. See my answer and comments for much simpler code.
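As a rough check of the robustness point, the diff-based boundaries from the answer can be tried on a made-up vector with a leading NaN and two consecutive NaNs. The test vector P and the use of mean in place of fitdist are assumptions for illustration only:

```matlab
P = [NaN;1;2;NaN;NaN;5;NaN];      % made-up input: leading and consecutive NaNs
idn = isnan(P);
idd = diff(idn);
idb = find([~idn(1); idd<0]);     % run starts
ide = find([idd>0; ~idn(end)]);   % run ends
out = NaN(size(idb));             % preallocated, no growing inside the loop
for k = 1:numel(idb)
    % mean is a stand-in for fitdist(...).mu (assumption)
    out(k) = mean(P(idb(k):ide(k)));
end
```

Only the two runs of numbers are picked up (indices 2 to 3 and index 6), so out comes to [1.5; 5] and the NaN gaps produce no empty selections.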
