Issues with reproducibility in multistart with parallelization
13 次查看(过去 30 天)
显示 更早的评论
I am running several model fits (1139) using multistart with parallelization. I first ran with 50 start points (including my initial guess). I then wanted to re-run with 150 start points, to compare the reduction in fval. So, when running the first 50, I batched my fits manually across a handful of nodes, and saved the rng state to a mat file. When running the 150, I batched the fits in the same way, except for a set of about 100 where that node was unavailable, and loaded the rng state.
To test the effect of the node (I think I read the node influences the number generator, and was curious what would happen with reproducibility) I ran that set of 100 fits across two nodes, loading the same state each time. In this case, I got identical outputs, identical function values.
However, I did not get good reproducibility from the 50 start point run to the 150 start point run. 27% of the 1139 fits were worse (had higher function values in a minimization problem) than the 50 start point fits. I also found that of the 1139 fits, 17% had greater than 1% higher fval, and 3% had 10% greater fval - I thought maybe its rounding or something, but this seems pretty high.
What am I missing? How can I make these fits reproducible?
0 个评论
回答(1 个)
Matt J
2025-9-2
编辑:Matt J
2025-9-2
I don't see why you would expect agreement between a 50-point multistart and a 150-point multistart. Only if both versions succeed in finding the global minimum would the results be guaranteed to agree.
4 个评论
Matt J
2025-9-3
You are quite welcome, but when/if you are convinced this is the correct answer please accept-click it.
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Global or Multiple Starting Point Search 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!