Call parfeval using global variables?

I am using parfeval() to do multithreaded work. I have 16 cores. The documentation mentions listing the calling parameters as arguments in parfeval, for example:
for ii = 1:partition
    f(ii) = parfeval(@func, 1, a, b, c);
end
My three arguments a, b, and c happen to be globals. Can I call parfeval this way instead, and pick up the globals in func? The return value is also a global.
for ii = 1:partition
    f(ii) = parfeval(@func, 0);
end
I suspect not, since my code isn't working.
Also, one of the three arguments is a pathname to a DOS batch file, which I execute with a system call. Is this possible across multiple cores?

Accepted Answer

Using a combination of spmd and spmdIndex (not labindex, which is no longer recommended), I was able to achieve what I wanted: setting the DOS path environment for each worker individually. You can also write messages from each worker to a shared log file:
parpool(4);
arrayStuff = ["Users", "Home", "Lib", "exe"];
execution_path = 'C:/Users/kurt/matlab/batch_files';
run_path = strcat(execution_path, '/run_cases.bat');
logfile = 'C:/Users/kurt/logfile.txt';
spmd
    switch spmdIndex
        case 1
            if ispc % running on a PC?
                setenv('input_path', arrayStuff(spmdIndex)); % set the DOS environment
            end
            system(run_path, '-echo'); % do some work on this worker
            msg = fopen(logfile, 'a'); % open diagnostic log
            fprintf(msg, '%s worker %d completed\n', char(datetime('now')), spmdIndex);
            fclose(msg);
        case 2
            if ispc % running on a PC?
                setenv('input_path', arrayStuff(spmdIndex)); % set the DOS environment
            end
            system(run_path, '-echo'); % do some work on this worker
            msg = fopen(logfile, 'a'); % open diagnostic log
            fprintf(msg, '%s worker %d completed\n', char(datetime('now')), spmdIndex);
            fclose(msg);
        case 3
            % etc.
        case 4
            % etc.
    end
end
The system, fprintf, and setenv calls are not strictly necessary, but they illustrate some capabilities for working with remote workers. Printing to a log file is one way to debug multithreaded code, since you can't set breakpoints on workers. Note that system calls only work in process-based pools, not thread-based pools.
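Since that restriction is easy to trip over, a minimal guard can check the pool type before any system() calls. This is a sketch, untested here; the class names parallel.ThreadPool and parallel.ProcessPool are the documented pool types in recent releases:

```matlab
% Sketch: make sure we have a process-based pool before calling system().
pool = gcp("nocreate");                 % current pool, if any, without creating one
if isempty(pool)
    pool = parpool("Processes", 4);     % system() requires process workers
elseif isa(pool, "parallel.ThreadPool")
    error("system() is unsupported on thread workers; start parpool(""Processes"") instead.");
end
```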

1 Comment

Another useful technique:
If a worker process crashes, it is well-nigh impossible to return an error message to the main process unless you pass it to a log file as described above. You can encapsulate each worker's code block in a try-catch-end block and write the error message to your log file as follows:
case 14 % worker number 14
    try
        do_some_stuff(arg1, arg2...)
    catch ME
        msg = fopen(logFile, 'a');
        fprintf(msg, '%s: %s\n', ME.identifier, ME.message);
        fclose(msg);
    end
case 15...


More Answers (1)

When you pass a global as a parameter, what is received in the called routine is treated as a local. For example, assigning to that parameter does not change the global variable.
This is the case when using parfeval as well. The global a, b, c in your call would be received as local.
To emphasize: if you had
function result = func(a, b, c)
result = some_internal_function;
end
function result = some_internal_function
global a b c
result = a*25 + b*5 + c;
end
then the fact that you passed global variable a into func does not mean that a gets treated as global inside some_internal_function. You would have to code something like
function result = func(a_in, b_in, c_in)
global a b c
a = a_in; b = b_in; c = c_in;
result = some_internal_function;
end
and then a b c would become global within that worker for as long as the worker continues to live. But if some_internal_function modified (say) b then the change would not affect any other worker and would not affect the client.
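If you do want the globals established on every worker up front rather than inside each task, one sketch (untested; aVal, bVal, cVal are placeholder client-side values) uses parfevalOnAll, which runs a function once on every worker in the pool:

```matlab
% Sketch: seed each worker's global workspace before the real work starts.
pool = gcp;
parfevalOnAll(pool, @seedGlobals, 0, aVal, bVal, cVal);

function seedGlobals(a_in, b_in, c_in)
% Runs on each worker; copies the passed-in values into that
% worker's own global workspace.
global a b c
a = a_in; b = b_in; c = c_in;
end
```

The seeded values persist only for the lifetime of the worker processes, per the behavior described above.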

10 Comments

That makes sense. But back to the original question, I assume I must treat parfeval as documented, passing variables in as arguments - not just expecting to find them there as globals?
This whole scheme is a tall order. I need to execute a collection of DLLs on each core, and I'm not sure how to get them installed there. The DOS batch file I mentioned executes these DLLs, generating output back to the main core.
In the original Python implementation, I just racked up about 50 threads on the main core, maxing it out to 100% CPU. But this is a very different approach in Matlab.
This just in: apparently function 'system' is not supported on a thread-based worker.
In my @func code, I have the following line:
system(batch_pathname, '-echo');
This will make it very difficult to run DLLs on the other cores.
Would a process-based approach work better for me, perhaps?
Process-based parfor is able to call system().
There is no possibility in current versions that a global variable will be copied into a process-based worker. The same appears to be true for background pools:
pool = backgroundPool()
pool =
 BackgroundPool with properties:
    NumWorkers: 2
          Busy: false
global X
X = 2;
for K = 1 : 3
    F(K) = parfeval(pool, @mycode, 1, X);
end
wait(F);
G = fetchOutputs(F, 'UniformOutput', false)
G = 3×1 cell array
    {[NaN]}
    {[NaN]}
    {[ -4]}
X
X = 2
function result = mycode(inval)
global X
if isempty(X)
    result = NaN;
    X = -3;
else
    X = X + X;
    result = inval + X;
end
inval = 7; % does modifying this change the global?
end
In the above, the reason that the third output is -4 rather than NaN is that there are only two workers in the pool, and each of them gets an independent global workspace. Each starts with an empty global workspace, so isempty(X) is true the first time code runs there. But there are three tasks, so the third task inherits the existing global workspace of whichever of the two workers it ends up executing in, and the previous task on that worker had already modified its global workspace.
So when a worker pool is created, each worker gets a global workspace that is empty. If something happens to modify the global workspace by way of code executing on the worker then that worker's global workspace will be updated (without affecting anything else.) As long as the worker exists, changes on the worker to its global workspace will persist.
But you can see that even though we passed the global in as parameter, modifying it (inval = 7) did not have any global effect, and changes to the globals that live on the workers did not have any effect on the client.
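Given that, the reliable pattern is the documented one from the original question: pass the values in explicitly and collect results with fetchOutputs rather than relying on globals on either side. A minimal sketch, reusing partition, func, a, b, and c from the question:

```matlab
% Sketch: explicit argument passing -- the values of a, b, c are copied
% into each task, and results come back through the futures.
global a b c
for ii = 1:partition
    f(ii) = parfeval(@func, 1, a, b, c);
end
results = fetchOutputs(f);   % blocks until all futures have finished
```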
You mention DOS. I wonder if System.Diagnostics.Process is supported on background pool workers?
I seldom boot Windows these days, so this is not something I can easily test.
Okay, I got as far as executing my DOS batch file via a system call. However, the batch file references a whole bunch of pathnames for inputs, libraries, executables and such which don't exist on the workers in this process pool. The batch file returns an error message, which I am able to write to a log file (disp and fprintf don't work here, remember).
Is there something analogous to the Linux ssh command that will allow me to access other cores? I can run my executable code on the remote core as long as I have access to the libraries and data files on the main core. Or something.
Or maybe I should just go back to using a parallel For loop?
You can do things like tcpclient (available in base MATLAB) and tcpserver (Instrument Control Toolbox). However, these are not designed to connect to "cores", only to IP addresses.
You can use addAttachedFiles on a parallel pool -- and my tests show you can do it on a backgroundPool as well. It does not matter whether you then use parfor with the pool or if you use parfeval with the pool: the files will already exist in the workers either way.
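A minimal sketch of that approach (the file names here are placeholders for the batch file and its helper binaries):

```matlab
% Sketch: attach files so every worker in the pool receives a local copy.
pool = gcp;
addAttachedFiles(pool, {'run_cases.bat', 'helper1.dll', 'helper2.dll'});
updateAttachedFiles(pool);   % re-push the copies if the files change mid-session
```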
Well, we're getting closer (maybe). I am now attempting to use spmd to run my code:
execute_path = 'C:/Users/username/batchfiles/';
execute_pathname = strcat(execute_path, 'run_cases.bat');
spmd
    system(execute_pathname, '-echo');
end
I am running this in process mode, so system calls are legal. The difficulty arises because the run_cases.bat file includes paths and references to various libraries and .exe executables that don't exist on each worker. The batch file also references various environment variables that I set with the MATLAB setenv() function.
Do I need to set up each remote core CPU with the same directory structure, libraries and executables as the main core? If so, how?
By the way, when I run this code with spmd(0) (running single thread on the main core), it works.
What difficulty did you encounter with addAttachedFiles?
Walter, I did not pursue that path. I was able to get the code to work as described in the previous comment without having to worry about all the issues I had questions about.
However, I still have one step to complete: Currently all 16 cores are running the identical software and producing the identical results. What I want to happen is for each core to work on a different set of input and output files, simultaneously. I'm not sure how to do that. I need some kind of loop within the spmd block to modify the pathnames for each core, something like this:
spmd
    for each core % pseudocode: not all simultaneously
        setenv('input_path', input_pathname);
        setenv('output_path', output_pathname);
        system(execute_pathname, '-echo');
    end
end
(Later:) I saw your post on labindex. Maybe that will work?
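It can, per the accepted answer above: spmd already runs its body once per worker, so no inner loop is needed; index per-worker paths with spmdIndex instead. A sketch, with placeholder path arrays sized to the pool:

```matlab
% Sketch: give each worker its own input/output directories via spmdIndex.
input_paths  = ["C:/data/in1",  "C:/data/in2",  "C:/data/in3",  "C:/data/in4"];
output_paths = ["C:/data/out1", "C:/data/out2", "C:/data/out3", "C:/data/out4"];
spmd
    setenv('input_path',  input_paths(spmdIndex));   % different on each worker
    setenv('output_path', output_paths(spmdIndex));
    system(execute_pathname, '-echo');
end
```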


Release

R2022b
