Looking for something like a matrix version of randsample... [vectorization!]

12 次查看(过去 30 天)
If I have a vector w (length n) and I want to pick a single random number between 1 and n using w as the weights, I can do something like
idx = randsample(n,1,true,w)
and I'll get a number between 1 and n with probability w(idx)/sum(w). Great.
Similarly, I have a matrix W (size N x M, where each of N,M is in the thousands or so), and I want to draw M random numbers between 1 and N, with the columns of W acting as independent weight vectors. I could obviously do
idx = zeros(N,1);
for i = 1:M
idx(i) = randsample(N,1,true,W(:,i));
end
...but I'm going to be calling this literally billions of times, so I'm looking for some efficiency.
I know that an equivalent way to think of this is to take my W matrix, normalize the columns so that they sum to one, do a cumsum on the columns, select a vector of uniform random numbers using rand(1,M), and find the first row indices where they are greater than the cumsum values, but I don't know how to do that without using a loop and find():
W_normalized = bsxfun(@rdivide,W,sum(W,1));
W_cdf = cumsum(W,1);
x = rand(1,M);
C = bsxfun(@lt,x,W_cdf);
and then the first row of each column of C with a 1 in it is my random number, but I haven't had any luck doing that in an efficient, vectorized way (I've seen this thread, but I think their conclusion to use a for-loop doesn't really seem to hold for larger matrices).
Any suggestions?
Thanks, Dan

采纳的回答

Kirby Fears
Kirby Fears 2015-11-25
编辑:Kirby Fears 2015-11-30
Here's a pure matrix version of your bsxfun calls. It should be internally parallelized if your matrices are large.
W_normalized = W./repmat(sum(W,1),size(W,1),1);
W_cdf = cumsum(W_normalized,1);
x = rand(1,size(W,2));
C = repmat(x,size(W,1),1)<W_cdf;
You still need to find the first "true" value in C for each column to form an index.
Below I make a logical array where only the first "true" is present in each column:
idx = [C(1,:); xor(C(2:end,:),C(1:end-1,:))];
Hope this helps.
  4 个评论
Dan Gianotti
Dan Gianotti 2017-3-8
I just realized that I never "accepted" this answer. Thanks very much for the huge assistance! You presumably saved me a hundred years of CPU time.

请先登录,再进行评论。

更多回答(0 个)

类别

Help CenterFile Exchange 中查找有关 Logical 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by