removing specified data from variable
I have a 100x2 dataset I am working with. I also have 2 random distributions of data.
I want to modify my original dataset in the following way:
- randomly draw a number from random distribution 1 and keep that many rows of the data;
- randomly draw a number from random distribution 2 and remove that many rows of the data.
I want to do this for the full length of the dataset.
Can anybody help me define this?
time = 1:100;
var = rand(100,1);
data = [time' var]; % dataset
dist1 = 1 + (20-1).*rand(100,1); % random distribution 1
dist2 = 10 + (30-10).*rand(100,1); % random distribution 2
position1 = randi(length(dist1)); % random position in dist1
card1 = dist1(position1); % value drawn from dist1 (not an integer)
position2 = randi(length(dist2)); % random position in dist2
card2 = dist2(position2); % value drawn from dist2 (not an integer)
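A single keep/remove cycle of the procedure described above could be sketched as follows. Because `dist1` and `dist2` hold non-integer values, the sketch rounds each drawn value before using it as a row count; the rounding and the names `nKeep`, `nRemove`, `kept`, and `nextStart` are assumptions for illustration, not part of the original question.

```matlab
% Minimal sketch of one keep/remove cycle (names below are hypothetical).
data  = [(1:100)' rand(100,1)];              % 100x2 dataset
dist1 = 1 + (20-1).*rand(100,1);             % random distribution 1, values in (1,20)
dist2 = 10 + (30-10).*rand(100,1);           % random distribution 2, values in (10,30)

% Draw one value from each distribution and round it to a whole row count
% (rounding is an assumption; floor or ceil would work as well).
nKeep   = round(dist1(randi(numel(dist1))));
nRemove = round(dist2(randi(numel(dist2))));

kept      = data(1:nKeep,:);                 % rows kept in this cycle
nextStart = nKeep + nRemove + 1;             % first row of the next cycle
```

Repeating this cycle from `nextStart` until the end of the dataset would cover the full length of the data.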
Answers (2)
Davide Masiello
2022-11-7
Edited: Davide Masiello
2022-11-7
I think the following code is a simpler way of achieving your task.
At each iteration it pulls a random integer from one of two vectors (distribution1 on odd iterations, distribution2 on even iterations), capped at the number of rows remaining. That integer is the number of rows to either keep or skip.
The code below prints a description of the action taken at each iteration.
data = [(1:100)' rand(100,1)]; % Dataset
datanew = []; % Rows that are kept
distribution1 = randi(100,100,1); % Array of random integers (to be replaced with a Gaussian distribution later)
distribution2 = randi(100,100,1); % Array of random integers (to be replaced with a Gaussian distribution later)
index = 0; % Last row processed so far
iter = 1;
while index < size(data,1)
    fprintf('This is iteration number %d.\n',iter)
    if mod(iter,2) == 1 % Odd iterations keep rows
        % Pull a value from distribution1, capped at the number of rows remaining
        increment = min(distribution1(randi(length(distribution1))),size(data,1)-index);
        fprintf('The random number is %d.\n',increment)
        fprintf('We keep the rows between %d and %d.\n',[index+1,index+increment])
        datanew = [datanew;data(index+1:index+increment,:)];
    else % Even iterations skip rows
        % Pull a value from distribution2, capped at the number of rows remaining
        increment = min(distribution2(randi(length(distribution2))),size(data,1)-index);
        fprintf('The random number is %d.\n',increment)
        fprintf('The rows between %d and %d do not get added to the new dataset.\n',[index+1,index+increment])
    end
    iter = iter+1;
    index = index+increment;
end
size(data)
size(datanew)
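As a possible variation on the loop above, the kept rows can be marked in a logical mask and extracted in a single indexing step, which avoids growing `datanew` on every iteration. This is only a sketch: the names `keepMask`, `keepPhase`, `pool`, and `dataremoved`, the fixed seed, and the `randi(20,...)` placeholder vectors are assumptions, not part of the original answer.

```matlab
% Sketch: build a logical keep-mask, then index once.
rng(0);                                  % fixed seed for a repeatable run (assumption)
data = [(1:100)' rand(100,1)];           % same 100x2 dataset as above
distribution1 = randi(20,100,1);         % placeholder "keep" lengths (assumption)
distribution2 = randi(20,100,1);         % placeholder "remove" lengths (assumption)

nRows     = size(data,1);
keepMask  = false(nRows,1);              % true marks rows that are kept
index     = 0;                           % last row processed so far
keepPhase = true;                        % start with a "keep" block

while index < nRows
    if keepPhase
        pool = distribution1;
    else
        pool = distribution2;
    end
    % Pull a block length from the active vector, capped at the rows remaining
    increment = min(pool(randi(numel(pool))), nRows - index);
    if keepPhase
        keepMask(index+1:index+increment) = true;
    end
    index     = index + increment;
    keepPhase = ~keepPhase;              % alternate between keep and remove
end

datanew     = data(keepMask,:);          % rows that were kept
dataremoved = data(~keepMask,:);         % rows that were skipped
```

Keeping the mask also makes the removed rows available afterwards through `~keepMask`, should they be needed.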
Comments (5)
Davide Masiello
2022-11-7
But why do you first generate a random distribution and then randomly take a value from it?
How is this different from just generating a random number?
That is, how is this
distribution1 = randi(10,100,1); % array of 100 random integers (max value = 10)
a = distribution1(randi(100,1,1)) % integer randomly pulled from distribution1
different from this?
a = randi(10,1,1) % random integer between 1 and 10
Davide Masiello
2022-11-7
Ok I see now, sorry I must have skipped that part.
I have modified my answer so that the number of rows to keep/remove is pulled randomly from the vectors which I called distribution1 and distribution2.
These are random vectors; you can replace them with the Gaussian distributions at your discretion.
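One way the two vectors might be replaced with Gaussian-distributed values is sketched below. The means and standard deviations (`mu1`, `sigma1`, `mu2`, `sigma2`) are assumed values chosen purely for illustration. The draws are rounded and clipped to at least 1 so that every pulled value is a valid row count and the loop always advances.

```matlab
% Sketch: Gaussian-based block lengths (all parameter values are assumptions).
mu1 = 10; sigma1 = 3;                    % "keep" lengths: assumed mean and std
mu2 = 20; sigma2 = 5;                    % "remove" lengths: assumed mean and std

% randn draws from the standard normal distribution; scale and shift it,
% then round to whole rows and clip at 1 so no draw is zero or negative.
distribution1 = max(1, round(mu1 + sigma1.*randn(100,1)));
distribution2 = max(1, round(mu2 + sigma2.*randn(100,1)));
```

Note that clipping at 1 slightly distorts the lower tail of the distribution, which matters more when the mean is small relative to the standard deviation.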