Problem running a parfor loop on my system: getting the message "A worker aborted during execution of the parfor loop."

I have code that uses parfor. Interestingly, when I ran this code earlier on another Windows machine, it ran properly without any error.
However, when I try to run the same code on a new machine, also running Windows, it always gives me the following error:
Warning: A worker aborted during execution of the parfor loop. The parfor loop will now run again on the remaining workers.
> In parallel.internal.parfor/ParforEngine/handleIntervalErrorResult (line 285)
In parallel.internal.parfor/ParforEngine/getCompleteIntervals (line 227)
In parallel_function>distributed_execution (line 742)
In parallel_function (line 574)
In MolecularNetworking_LongBINOMIAL_AverageFOR_4PAR (line 15)
Parallel pool using the 'Processes' profile is shutting down.
Error using parallel.internal.parfor.ParforEngine/rebuildParforController (line 134)
The parallel pool that parfor was using has shut down. To start a new parallel pool, run your parfor code again or use parpool.
Error in parallel.internal.parfor.ParforEngine/handleIntervalErrorResult (line 304)
obj.rebuildParforController();
Error in parallel.internal.parfor.ParforEngine/getCompleteIntervals (line 227)
[r, err] = obj.handleIntervalErrorResult(r);
Error in MolecularNetworking_LongBINOMIAL_AverageFOR_4PAR (line 15)
parfor binomialaverageloop=1:NumberofBINOMIALTrials
Caused by:
Error using parallel.internal.parfor.ParforEngine/buildParforController (line 93)
No running parallel pool. To start a new parallel pool use parpool.
I have tried several times and always get the same error. What should I do?
Someone suggested elsewhere that I increase the memory per core. I am using Windows and do not know how to do that.
2 comments
Edric Ellis 2024-12-3
There are several possibilities as to why your workers are aborting during parfor. Running low on memory is one possibility - try opening a parallel pool with a lower number of workers before running your code, for example parpool("Processes",2). If you have crash dump files (you'll see a warning when a pool starts up telling you about these), then there might be some other problem. In any case, I would recommend contacting MathWorks support who can work through some more diagnostic steps to help get to the bottom of this.
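The suggestion above could be tried along the following lines. This is only a rough sketch, assuming a Windows machine (the `memory` function is used here under that assumption) and the 'Processes' profile named in the warning; the worker count of 2 is the example value from the comment, not a recommendation.

```matlab
% Sketch: check available memory, then reopen the pool with fewer workers.
% Assumption: Windows, where the memory function reports system memory.

delete(gcp("nocreate"));              % close any existing pool (does nothing if none is open)

if ispc
    [~, sysMem] = memory;             % second output holds system-wide memory information
    fprintf("Available physical memory: %.1f GB\n", ...
        sysMem.PhysicalMemory.Available / 2^30);
end

pool = parpool("Processes", 2);       % request 2 workers instead of the default number
fprintf("Pool opened with %d workers\n", pool.NumWorkers);
```

If the loop completes with fewer workers, memory pressure per worker becomes a more likely explanation; if it still aborts, the crash dump files mentioned above are the next thing to look at.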
Ankit Gaur 2024-12-3
Edited: Ankit Gaur 2024-12-3
Even parfor with a single worker does not work. For example, I tried running the same code with
parpool("Processes",1)
but I still got the error:
Starting parallel pool (parpool) using the 'Processes' profile ...
Connected to parallel pool with 1 workers.
ans =
ProcessPool with properties:
Connected: true
NumWorkers: 1
Busy: false
Cluster: Processes (Local Cluster)
AttachedFiles: {}
AutoAddClientPath: true
FileStore: [1x1 parallel.FileStore]
ValueStore: [1x1 parallel.ValueStore]
IdleTimeout: 30 minutes (30 minutes remaining)
SpmdEnabled: true
Warning: A worker aborted during execution of the parfor loop. The parfor loop will now run again on the remaining workers.
> In parallel.internal.parfor/ParforEngine/handleIntervalErrorResult (line 285)
In parallel.internal.parfor/ParforEngine/getCompleteIntervals (line 227)
In parallel_function>distributed_execution (line 742)
In parallel_function (line 574)
In MolecularNetworking_LongBINOMIAL_AverageFOR_4PAR (line 15)
Error using parallel.internal.parfor.ParforEngine/rebuildParforController (line 134)
The parallel pool that parfor was using has shut down. To start a new parallel pool, run your parfor code again or use parpool.
Error in parallel.internal.parfor.ParforEngine/handleIntervalErrorResult (line 304)
obj.rebuildParforController();
Error in parallel.internal.parfor.ParforEngine/getCompleteIntervals (line 227)
[r, err] = obj.handleIntervalErrorResult(r);
Error in MolecularNetworking_LongBINOMIAL_AverageFOR_4PAR (line 15)
parfor binomialaverageloop=1:NumberofBINOMIALTrials
Caused by:
Error using parallel.internal.parfor.ParforEngine/buildParforController (line 93)
No running parallel pool. To start a new parallel pool use parpool.
Parallel pool using the 'Processes' profile is shutting down.
This parallel pool has been shut down.
Caused by:
The client lost connection to worker 1. This might be due to network problems, or the interactive communicating job might have
errored.


Answers (1)

Kautuk Raj 2024-12-31
Edited: Kautuk Raj 2024-12-31
It seems like you are encountering an issue related to the parallel pool shutting down unexpectedly during the execution of your parfor loop. This is a known issue, and you might find relevant information in this bug report from MathWorks: https://www.mathworks.com/support/bugreports/3187300
Specifically, it is suggested to increase the value of the MW_PCT_TRANSPORT_HEARTBEAT_INTERVAL environment variable to a large integer value, such as 100000, before opening your MATLAB session.
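Because the variable has to be set before MATLAB starts (for example through the Windows environment variable settings, or with `set MW_PCT_TRANSPORT_HEARTBEAT_INTERVAL=100000` in the Command Prompt window that then launches MATLAB), it can be worth confirming that the session actually sees it. A minimal check, assuming only the standard `getenv` function, might look like this:

```matlab
% Sketch: confirm that this MATLAB session inherited the environment variable.
% The variable name comes from the bug report workaround quoted above.
name = 'MW_PCT_TRANSPORT_HEARTBEAT_INTERVAL';
val  = getenv(name);                  % returns an empty char vector if the variable is not set

if isempty(val)
    fprintf("%s is not set in this session.\n", name);
else
    fprintf("%s = %s\n", name, val);
end
```

If the variable is reported as not set, MATLAB was probably started from a process that did not inherit the new value, and restarting it from the environment where the variable was defined should help.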

Release

R2024a
