Parpool shared data access

RevHardt 2015-1-20
Answered: Raghav Bansal on 2024-11-26, 6:34
I use:
if isempty(gcp('nocreate'))
parpool([ 1, Inf ]);
end
... to create my parpool in my wrapper function, which gives me 4 workers on my desktop. The wrapper function calls a file foo.m, each copy of which in turn calls bar.m several times.
The wrapper function generates heavy data, which is required in bar.m on a purely read-only basis:
  • wrapper.m:
genSpline = griddedInterpolant({ gridData.xgv, gridData.ygv, gridData.zgv }, gridData.data, 'spline', 'spline');
  • bar.m:
val = genSpline(interPts);
When passed as an argument to bar.m via foo.m, each worker in the pool maintains its own private copy of genSpline, which wastes an enormous amount of memory on redundant data. However, the program works fine as such.
In an effort to work around this, I prefixed the definition and use of gridData and genSpline with:
global gridData genSpline;
... as the documentation seems to suggest. However, this fails with:
'Subscript indices must either be real positive integers or logicals.'
... in bar.m. Reverting to passing via arguments proves that there is nothing wrong with interPts. Printing genSpline at its definition and at its use in the global-variable version gives this:
  • wrapper.m:
genSpline =
griddedInterpolant with properties:
GridVectors: {[1x41 double] [1x41 double] [1x12 double]}
Values: [41x41x12 double]
Method: 'spline'
ExtrapolationMethod: 'spline'
  • bar.m:
genSpline =
[]
... implying that either the global variable isn't being set properly, or for some reason is inaccessible to bar.m. There is no distributed network involved, and all files are within the same directory, which is on the MATLAB (R2014a 64-bit UNIX) path. Any suggestions?
PS: The same approach towards declaring and using global variables works with a 'regular' 2x2 matrix.

Answers (1)

Raghav Bansal 2024-11-26,6:34
Hi RevHardt,
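The issue stems from the fact that global variables are not shared between the client and the workers in a MATLAB parallel pool. Each worker runs in its own separate process with its own workspace, so a global variable set in wrapper.m on the client is not visible on the workers. A global declaration inside bar.m on a worker therefore creates a new, empty variable, which matches the [] you printed.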
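The error message itself follows from that empty value. Once genSpline is [], the expression genSpline(interPts) is treated as ordinary array indexing rather than interpolant evaluation, and non-integer query points are not valid subscripts. A minimal sketch that reproduces this on the client without any pool (the interPts values are made up for illustration):

```matlab
% An uninitialized global is the empty matrix [], not an interpolant.
genSpline = [];                 % stands in for the worker's empty global
interPts  = [0.5 1.5 2.5];      % hypothetical non-integer query points

msg = '';
try
    val = genSpline(interPts);  % parsed as array indexing, not evaluation
catch err
    msg = err.message;          % subscript error; exact wording varies by release
end
disp(msg)
```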
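There are two workarounds for this:
1) You can wrap genSpline in a parallel.pool.Constant. The data is then transferred to each worker only once and reused across parfor loops and function calls, instead of being sent again every time. Note that this does not place the data in memory shared between workers: each worker is a separate process and still holds its own copy. parallel.pool.Constant was introduced in R2015b, so it is not available in R2014a. See the MATLAB documentation for parallel.pool.Constant to learn more.
2) You can use "spmd" or a similar mechanism to build or broadcast the genSpline data on all workers explicitly. This also leaves one copy per worker. See the MATLAB documentation for spmd to learn more.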
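As a rough sketch of the first workaround, the snippet below wraps an interpolant in a parallel.pool.Constant and reads it inside a parfor loop. The grid data, the query points, and the names C, F, nTasks, and vals are invented for illustration and are not from the original code; it assumes R2015b or later with Parallel Computing Toolbox.

```matlab
% Synthetic grid data standing in for gridData (assumption, not the real data).
xgv = linspace(0, 1, 41);
ygv = linspace(0, 1, 41);
zgv = linspace(0, 1, 12);
[X, Y, Z] = ndgrid(xgv, ygv, zgv);
data = X.^2 + Y + Z;

% Build the interpolant once on the client, as wrapper.m does.
genSpline = griddedInterpolant({xgv, ygv, zgv}, data, 'spline', 'spline');

if isempty(gcp('nocreate'))
    parpool;
end

% Each worker receives one copy, which is reused across parfor loops.
C = parallel.pool.Constant(genSpline);

nTasks = 8;
vals = zeros(nTasks, 1);
parfor k = 1:nTasks
    interPts = [k/10, 0.5, 0.5];   % hypothetical query point, one row per point
    F = C.Value;                   % worker-local interpolant
    vals(k) = F(interPts);
end
```

To fit the wrapper/foo/bar layout, C could be passed as an argument through foo.m to bar.m, with bar.m evaluating C.Value(interPts) or the two-step form shown above.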
I hope this helps.
