Random Number Generation for Parallel Computing Toolbox

I am running Monte Carlo simulations with multiple chains. To run the chains in parallel, I open a worker for each chain and use a parfor loop. The problem is that each time I run the code, the randomized initial values are the same. I have tried using the rng function, but this does not seem to work with the Parallel Computing Toolbox. Is there a way to randomize the starting points for each matlabpool worker?
Thank you, Stephen
  1 comment
John Fox 2017-7-20
I had exactly the same problem: my for-loops gave different answers than my parfor-loops. The reason, quoting the documentation:
"As described in Control Random Number Streams, each worker in a cluster has an independent random number generator stream. By default, therefore, each worker in a pool, and each iteration in a parfor-loop has a unique, independent set of random numbers. Subsequent runs of the parfor-loop generate different numbers."
I fixed this with rng(123,'twister'). At least this worked for me.


Answers (7)

Jill Reese 2012-11-8
The R2012b documentation provides a section on controlling the random number streams on the client and on the workers. If it does not address your use case, that would be helpful to know so that we can improve it in future.
Best,
Jill

Peter Perkins 2012-11-8
Just to be clear, MATLAB initializes the random number generators on each worker so that they are definitely not the same, and are suitable for parallel computation. In most cases (needing reproducibility being one common exception), it should not be necessary to worry about initializing them.
It may be that something in your code is doing something to spoil that. The link Jill pointed to should help.
  15 comments
Ebru Angun 2022-3-12
I have a related question for independent and reproducible random number generation in parallel computing.
1- Can I use 'Threefry' instead of 'mrg32k3a' below?
stream = RandStream('mrg32k3a','Seed',seed);
parfor ii = 1:10
set(stream,'Substream',ii);
par(ii) = rand(stream);
end
2- In the parfor loop, I am using 'normrnd' and 'mvnrnd' which need 'rng' to set the seed. How can I make the 'normrnd' and 'mvnrnd' functions use the substream on a local worker?
Thanks in advance.
Ebru
Ebru Angun 2022-3-12
If we have to run a single program on 60 different randomly generated data sets, is it a better idea to use the rng command instead of creating 60 substreams, as follows? We have 12 workers (so at most 12 problems can be solved at a time), and the workers do not communicate with each other. The important issue here is to obtain 60 non-overlapping (independent) and reproducible random number streams that can be used with functions such as 'normrnd' and 'mvnrnd'. Thanks in advance.
parpool(60)
parfor i=1:60
rng(i);
r=normrnd(mu,sigma);
end
delete(gcp);
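
For the substream approach raised in the two comments above, here is a minimal sketch, not a definitive answer. It assumes a release that provides parallel.pool.Constant and the 'Threefry' generator; the values of seed, mu, and sigma are hypothetical placeholders. Each iteration selects its own substream, so the draws should not depend on which worker runs the iteration.

```matlab
seed  = 42;                 % hypothetical base seed
mu    = 0;                  % hypothetical distribution parameters
sigma = 1;
nRuns = 60;
r = zeros(nRuns, 1);

% Wrap the stream in a Constant so it is copied to each worker only once.
sc = parallel.pool.Constant(RandStream('Threefry', 'Seed', seed));

parfor i = 1:nRuns
    stream = sc.Value;
    stream.Substream = i;   % one substream per problem, restarted at its beginning
    % Option A: draw from the stream directly and scale to N(mu, sigma^2).
    r(i) = mu + sigma*randn(stream);
    % Option B (not run here): call RandStream.setGlobalStream(stream) first;
    % normrnd and mvnrnd draw from the global stream, so they would then
    % use this substream.
end
```

Option A avoids touching the worker's global stream; Option B is the route to consider when a function such as mvnrnd has no stream argument.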



Peter Perkins 2014-9-18
If I'm understanding correctly, the problem is that, just as with ordinary non-parallel MATLAB, the random numbers on each worker are the same each time you start up (the random number generators are set up using each worker's labindex). If you are doing one calculation in one session, that's fine. But if you want to combine results of MC simulations from multiple sessions, and be able to treat them as statistically independent, then obviously that is a problem.
If that's right, then the solution is to (re)initialize the generator differently on each worker each time you start it up, using pctrunonall. "Differently on each worker each time you start it up" can be achieved using something involving 'shuffle', but it's theoretically possible to get the same initialization in two places by random chance. So a better idea is a combination of labindex and some sort of unique session number.
Just as in the serial case, you could use rng(i), where i is based on the lab index and the session number. But there are parallel generators that are designed specifically for this kind of large-scale MC simulation context: mrg32k3a and mlfg6331_64. If you know how many workers and sessions, then do something like this:
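stream = RandStream.create('mrg32k3a','NumStreams',workers*sessions, ...
    'StreamIndices',workers*(session-1)+worker);
(Here session and worker are 1-based indices, so the stream index runs from 1 to workers*sessions.)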
That gives you statistical independence across workers, across sessions. That will work for those two generators. With a non-parallel generator like mt19937ar, your only course would be to use different seeds, but again you could base the seeds on labindex and the session number.
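
A sketch of how this could be applied on every worker of an open pool, using spmd rather than pctrunonall. The variables numSessions and session are hypothetical and would have to be set by hand for each run; labindex is the worker index mentioned above (newer releases also offer spmdIndex).

```matlab
numSessions = 10;   % hypothetical: total number of sessions planned
session     = 3;    % hypothetical: 1-based index of the current session

pool = gcp;         % current pool (starts one if none is open)
numWorkers = pool.NumWorkers;

spmd
    % 1-based index that is unique for each (session, worker) pair.
    idx = numWorkers*(session - 1) + labindex;
    s = RandStream.create('mrg32k3a', ...
        'NumStreams', numWorkers*numSessions, ...
        'StreamIndices', idx);
    % Make it the default stream for rand, randn, randi on this worker.
    RandStream.setGlobalStream(s);
end
```

This only stays collision-free if every session uses the same numWorkers and numSessions, since the index formula depends on both.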
Hope this helps.

Daniel Golden 2015-2-25
Try something like this to shuffle the random number generator on the client and on all the parallel workers:
pool = gcp;
rng('shuffle'); % Shuffles on the client only
% Shuffle on each parallel worker
seed_offset = randi(floor(intmax/10));
parfor kk = 1:pool.NumWorkers
    rng(kk + seed_offset);
end
Tested on R2014b.
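
One caveat: a parfor loop with as many iterations as workers is not guaranteed to place exactly one iteration on each worker. A variant of the same idea using spmd, which runs its body once on every worker, might look like the sketch below. The name seedOffset and the bound 2^30 are arbitrary choices for illustration.

```matlab
pool = gcp;                   % current pool (starts one if none is open)
rng('shuffle');               % reseeds the client only
seedOffset = randi(2^30);     % random base seed drawn on the client

spmd
    % labindex differs on every worker, so every worker gets its own seed.
    rng(seedOffset + labindex);
end
```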

Matteo 2016-3-9
I have run into this problem several times. The proposed approach, setting the seed to 100*clock, solved it. However, when the loop runs fast, it is better to increase the multiplier; otherwise the seed will not change on every iteration.

Chuck 2016-5-5
It does work with Parallel Computing Toolbox. Just add rng("shuffle") after the parfor line.
The difference might be due to your version, since this post is from 2012.

Chibuzo Nnonyelu 2018-3-10
One way to approach this is to generate the random numbers just before the parfor-loop. This may use more memory, depending on the size of the parfor-loop.
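
A minimal sketch of that idea, with hypothetical sizes and a placeholder computation: all random inputs are drawn on the client, and each iteration only reads its own row.

```matlab
rng('shuffle');                 % seed the client (use a fixed seed for reproducibility)
numIter = 100;                  % hypothetical number of iterations
draws = rand(numIter, 3);       % one row of random inputs per iteration
results = zeros(numIter, 1);

parfor k = 1:numIter
    u = draws(k, :);            % sliced input: only the needed rows go to each worker
    results(k) = sum(u);        % placeholder for the real computation
end
```

Because the generators on the workers are never used, the result depends only on the client's seed, at the cost of holding the full draws array in memory.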
