Randomly generate integers with a non-uniform distribution
I am trying to generate a matrix of random integers from 1 to 4, but I would like to define the distribution rather than have it be uniform. I have managed to generate a 5x5 matrix, but it shows a uniform distribution. Can I specify the distribution (e.g. 10% of 1s, 20% of 2s, 30% of 3s and the rest of them 4s)? Thank you.
randi([1,4],5,5)
Answers (1)
Walter Roberson
2018-4-1
Use randsample() and specify a weights vector.
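A minimal sketch of that suggestion, assuming the Statistics and Machine Learning Toolbox is available (randsample() is part of it). The variable names and the 5 x 5 size are taken from the question or chosen for illustration; they are not from the answer itself.

```matlab
% Weighted sampling with replacement from the integers 1..4.
weights = [0.1 0.2 0.3 0.4];   % desired probabilities of 1, 2, 3 and 4
nrows = 5;                     % illustrative size, matching the question
ncols = 5;

% randsample(n, k, true, w) draws k values from 1:n with replacement,
% using w as the sampling weights.
samples = randsample(4, nrows*ncols, true, weights);

% Arrange the drawn values as a matrix.
M = reshape(samples, nrows, ncols);
disp(M)
```

With only 25 samples the observed proportions will vary noticeably around the requested ones; the weights set probabilities, not exact counts.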
7 comments
Walter Roberson
2018-4-2
I forgot to mention that the code requires R2016b or later.
rand(arraysize) is (for example) 3 x 3, which is a 2 dimensional array. We need to compare every element in it to every element of cumsum(weights), here of length 4, which temporarily gives a 3 x 3 x 4 array. To do that we reshape the 1 x 4 vector cumsum(weights) to be 1 x 1 x 4. The easiest way to do that is to bring its third dimension, which has size 1 (1 x 4 is the same as 1 x 4 x 1), to the front, which makes it 1 x 1 x 4. Another way of doing what the permute() does would be to use
reshape(cumsum(weights), 1, 1, length(weights))
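The equivalence of the two approaches can be checked with a short sketch. The permute() order [3 1 2] is an assumption based on the description "bring the third dimension to the front"; the weights are the ones from the question.

```matlab
weights = [0.1 0.2 0.3 0.4];
cw = cumsum(weights);                      % 1 x 4 row vector

a = reshape(cw, 1, 1, length(weights));    % explicit reshape to 1 x 1 x 4
b = permute(cw, [3 1 2]);                  % assumed order: old dim 3 first, then dims 1 and 2

disp(size(a))          % 1 1 4
disp(size(b))          % 1 1 4
disp(isequal(a, b))    % 1 (true): same values in the same order
```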
So now we have a 3 x 3 array of random values, which is also a 3 x 3 x 1 array, and we have the 1 x 1 x 4 vector of values to compare against. Those are different sizes, but with R2016b or later we can take advantage of automatic (implicit) expansion, which applies along any dimension that has length 1 in one of the two operands and a different length in the other. In this case the third dimension is length 1 in the 3 x 3 array (which is also 3 x 3 x 1) but length 4 in the 1 x 1 x 4 vector, and the first two dimensions are length 1 in the vector but length 3 in the array. So each element of the random values will be compared against the 4 different cumulative weights, giving a 3 x 3 x 4 result. For releases before R2016b, you can also write the operation as
sum( bsxfun(@ge, rand(arraysize), reshape(cumsum(weights), 1, 1, length(weights))), 3 ) + 1
The sum() along the third dimension (the 3) returns a 2D array, in this case 3 x 3. Each entry is the number of cumulative-sum entries that the corresponding random number equalled or exceeded. 0 means that the value was smaller than the first entry in the cumulative weights table, 1 means that the value was at least the first entry but smaller than the second, and so on. rand() cannot exceed 1 (and cannot exactly reach 1 either), so as long as the weights sum to 1 the comparison against the last cumulative entry is always false. The sum therefore cannot reach 4, and you get values 0 to 3. Add 1 to those to get 1 to 4, which are the indices of the entries to look up.
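Putting the steps together, a sketch under these assumptions: the weights are non-negative and sum to 1, the names arraysize, cw, and N are illustrative, and the line forcing the last cumulative entry to 1 is an added guard against rounding in cumsum(), not part of the explanation above.

```matlab
weights = [0.1 0.2 0.3 0.4];          % probabilities of 1, 2, 3 and 4
arraysize = [5 5];                    % illustrative output size

cw = reshape(cumsum(weights), 1, 1, length(weights));   % 1 x 1 x 4
cw(end) = 1;                          % added guard: make the last threshold exactly 1

r = rand(arraysize);                  % 5 x 5 uniform values in (0,1)

% R2016b or later: implicit expansion gives a 5 x 5 x 4 logical array.
M = sum(r >= cw, 3) + 1;

% Earlier releases: same comparison through bsxfun.
M2 = sum(bsxfun(@ge, r, cw), 3) + 1;
disp(isequal(M, M2))

% Empirical check on a larger sample: the observed frequencies
% should be close to the requested weights.
N = 1e5;
big = sum(rand(N, 1) >= cw, 3) + 1;   % N x 1 vector of values 1..4
freq = accumarray(big, 1, [4 1]) / N;
disp(freq.')
```

Because the values in M are used directly as the results here, no lookup table is needed; to draw from an arbitrary set of values instead, index into that set with M.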