How to replace some of the values in a matrix with NaN?

The simple case is like this:
2 1 4 6 2
9 4 6 1 2
5 3 2 8 3
7 2 1 9 3
7 1 8 2 4
In the matrix above, I want to insert 3 NaNs at random positions. My code looks like this:
Data = [2,1,4,6,2;9,4,6,1,2;5,3,2,8,3;7,2,1,9,3;7,1,8,2,4];
[rows,cols] = size(Data);
p = 3;             % number of NaNs to insert
r = randperm(25);  % random permutation of the numbers 1..25
r = r(1:p);        % keep p distinct random numbers from 1..25
i = 1; a = 1; b = 1;
while i <= p       % turn each number in vector r into a NaN position
    n = r(a,b);
    b = b+1;
    e = 1;         % row counter
    if n <= cols
        Data(1,n) = NaN;
    else
        if n > cols
            while n > cols   % move down one row for every full row of columns
                e = e+1;
                k = n - cols;
                n = k;
            end
            Data(e,n) = NaN;
        end
    end
    i = i+1;
end
One possible output looks like this:
2 1 4 6 2
9 NaN NaN NaN 2
5 3 2 8 3
7 2 1 9 3
7 1 8 2 4
Now I want to add some constraints:
1. Every row can have at most 2 NaNs.
2. The number of NaNs in column 1 has to be less than in column 2, the number of NaNs in column 2 has to be less than in column 3, and so on. For example, the output matrix could look like this:
2 1 4 6 2
9 4 6 1 NaN
5 3 2 8 3
7 2 1 NaN 3
7 1 8 2 NaN
For the matrix above we can see that the number of NaNs is:
column 1 = 0, column 2 = 0, column 3 = 0, column 4 = 1, column 5 = 2.
Can somebody help me add these constraints to my code above? Or maybe there is another solution.
Thanks in advance :')
2 comments
Isti, 2012-4-22
No, I don't. Actually, I'm new to MATLAB :(
Could you tell me more about that? Maybe it will help with my problem.


Answers (3)

per isakson 2012-4-22
This is an idea that I have not tested!
jj = 0;
for ii = r
    [rr,cc] = ind2sub( size(Data), ii );
    if sum(isnan(Data(rr,:)))>=2 || sum(isnan(Data(:,cc)))>=2
        % do nothing: this row or column already holds two NaNs
    else
        Data(rr,cc) = nan;
        jj = jj + 1;
        if jj == 3, break
        end
    end
end
--- EDIT ---
The function below will return a result. The constraint is "no more than two NaNs in any column or row". However, that was not what you asked for.
function Data = cssm
    Data = [2,1,4,6,2;9,4,6,1,2;5,3,2,8,3;7,2,1,9,3;7,1,8,2,4];
    p = 3;   % number of NaNs to insert
    row_vector = randperm(numel(Data));
    jj = 0;
    for ii = row_vector
        [rr,cc] = ind2sub( size(Data), ii );
        if sum(isnan(Data(rr,:)))>=2 || sum(isnan(Data(:,cc)))>=2
            % do nothing: this row or column already holds two NaNs
        else
            Data(rr,cc) = nan;
            jj = jj + 1;
            if jj == p, break
            end
        end
    end
end
With the constraint "the number of NaNs in column 1 has to be less than in column 2, the number in column 2 has to be less than in column 3, and so on", there is no solution: five strictly increasing column counts need at least 0+1+2+3+4 = 10 NaNs, and you only insert 3. Do you exclude columns with zero NaNs from that constraint?
If so, then (according to my reading) the last column can have two or three NaNs and the second-last column one or zero NaNs. NaN cannot appear in the other columns.
4 comments
Isti, 2012-4-22
Ooh, I think your suggested code doesn't fulfill my second constraint :(
Isti, 2012-4-24
Of course not, the columns with zero NaNs are also included. So when columns have zero NaNs, they will be the leftmost columns of the matrix.
By the way, what is ind2sub used for above? I don't understand it yet.
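On the ind2sub question, a minimal sketch may help. MATLAB numbers matrix elements column by column (linear indexing), and ind2sub converts such a linear index into a row and column subscript; sub2ind goes the other way. The index values in idx are arbitrary picks for illustration.

```matlab
Data = [2,1,4,6,2;9,4,6,1,2;5,3,2,8,3;7,2,1,9,3;7,1,8,2,4];

% Three linear indices, counted down the columns of the 5-by-5 matrix
idx = [5 7 23];

% Convert them to (row, column) subscripts
[rr, cc] = ind2sub(size(Data), idx);   % rr = [5 2 3], cc = [1 2 5]

% Both addressing styles reach the same elements
viaLinear = Data(idx);                               % [7 4 3]
viaSubs   = Data(sub2ind(size(Data), rr, cc));       % [7 4 3]
same = isequal(viaLinear, viaSubs);                  % true
```

In the answer above, ind2sub is what lets the loop ask "which row and which column does this random position fall in?" so that the NaNs already present in that row and column can be counted.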



Richard Brown 2012-4-22
This is another one of these problems where the simplest way to solve it is to randomly generate candidates until you find one that fits:
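A = reshape(randperm(25), 5, 5);
done = false;
while ~done
    idx = randperm(25, 3);           % 3 distinct linear indices
    [I, J] = ind2sub([5 5], idx);    % their row and column subscripts
    m = hist(I, 1:5);                % NaN count per row
    n = hist(J, 1:5);                % NaN count per column (zeros included)
    done = all(m <= 2) && all(diff(n) >= 0);
end
A(idx) = NaN;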
It's trivial (but a little messier) to make it more general, so I'll leave you to do that if you need to.
EDIT: changed the code to use randperm instead of randi, so only one call to the random number generator is necessary.
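One way the generalisation might look is sketched below. This is only an illustration of the same rejection idea with the sizes taken from the data rather than hard-coded; the names nRows, nCols, p and maxPerRow are made up for the example, and the column test assumes ties between neighbouring columns are allowed.

```matlab
Data = [2,1,4,6,2;9,4,6,1,2;5,3,2,8,3;7,2,1,9,3;7,1,8,2,4];
[nRows, nCols] = size(Data);
p = 3;           % number of NaNs to insert (assumed parameter)
maxPerRow = 2;   % row constraint (assumed parameter)

done = false;
while ~done
    % Draw p distinct positions and convert them to subscripts
    idx = randperm(nRows*nCols, p);
    [I, J] = ind2sub([nRows nCols], idx);

    % Count candidates per row and per column, zeros included
    rowCounts = accumarray(I(:), 1, [nRows 1]);
    colCounts = accumarray(J(:), 1, [nCols 1]);

    % Accept only if no row is overfull and column counts never decrease
    done = all(rowCounts <= maxPerRow) && all(diff(colCounts) >= 0);
end
Data(idx) = NaN;
```

Because every rejected draw is thrown away, the expected number of iterations grows quickly with the matrix size and the number of NaNs, so this is only practical for small problems.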
1 comment
Isti, 2012-4-28
Thanks for this answer. It works on my small dataset, but for my medium dataset (about 1500 rows * 11 columns) with more NaNs to insert, it takes a very long time, and I even decided to cancel it :(
If I drop the 2nd constraint and only use the 1st constraint, is there any way to make it faster?
Thanks in advance.



Richard Brown 2012-4-29
Here's a much faster method that satisfies both of your constraints. It may be possible to vectorise the loop, but it is, in my opinion, not worth the effort.
First, generate the data
X = rand(1500, 11);
[m,n] = size(X);
nNans = 2000;
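We figure out the row and column indices separately. The rows are easy: a single call to randperm does the trick. Each row index arises from exactly two of the values 1..2m, so no row can be picked more than twice (this needs nNans <= 2*m).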
I = mod(randperm(2*m, nNans), m) + 1;
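To see why this line respects the "at most 2 NaNs per row" rule, here is a small sketch with made-up sizes (m = 4 rows, 6 NaNs). The values drawn by randperm are distinct, and after the mod each row index can come from only two of them, so the per-row count never exceeds 2. The variable names v and rowCounts are illustrative only.

```matlab
m = 4;        % number of rows (small hypothetical value)
nNans = 6;    % number of NaNs to place, must not exceed 2*m

v = randperm(2*m, nNans);   % nNans distinct values from 1..2m
I = mod(v, m) + 1;          % map each value to a row index in 1..m

% Count how often each row was selected
rowCounts = accumarray(I(:), 1, [m 1]);
```

For example, with m = 4 the values 1 and 5 both map to row 2, and no third value does, which is where the limit of two comes from.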
Then figure out the column positions randomly, going row by row to avoid creating duplicate entries.
J = zeros(1, nNans);
for i = 1:m
    idx = (I == i);                    % entries that fall in row i
    J(idx) = randperm(n, nnz(idx));    % distinct columns within that row
end
We now need to make sure the columns are ordered correctly. So we construct a logical matrix encoding the position of the NaN entries, and reorder the columns to satisfy your column constraint.
iNan = false(m, n);
iNan(sub2ind([m n], I, J)) = true;
[~, iSorted] = sort(hist(J, 1:n));
iNan = iNan(:, iSorted);
We now have a logical array with the right properties. The last step is to overwrite the corresponding entries of X:
X(iNan) = nan;
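A quick way to confirm that a result satisfies both constraints is to count the NaNs per row and per column. The sketch below runs the check on the example output from the question instead of the large random matrix; the names nanMask, rowCounts and colCounts are illustrative, and the column test assumes that equal counts in neighbouring columns are acceptable.

```matlab
% Example result taken from the question
X = [2 1 4 6 2; 9 4 6 1 NaN; 5 3 2 8 3; 7 2 1 NaN 3; 7 1 8 2 NaN];

nanMask   = isnan(X);
rowCounts = sum(nanMask, 2);   % NaNs in each row:    [0; 1; 0; 1; 1]
colCounts = sum(nanMask, 1);   % NaNs in each column: [0 0 0 1 2]

rowsOK = all(rowCounts <= 2);        % constraint 1: at most 2 per row
colsOK = all(diff(colCounts) >= 0);  % constraint 2: counts never decrease
```

The same three lines starting at nanMask can be run on X from the answer above, where nnz(nanMask) should also equal nNans.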
