How to use all cores

christian matira 2021-9-16
I cannot use all cores on the server. I requested exclusive access to this server and installed the license as an individual. I am running MATLAB from the command line:
./matlab -nodisplay
>> ver
-----------------------------------------------------------------------------------------------------
MATLAB Version: 9.6.0.1472908 (R2019a) Update 9
MATLAB License Number: XXXXXX
Operating System: Linux 3.10.0-1127.18.2.el7.x86_64 #1 SMP Sun Jul 26 15:27:06 UTC 2020 x86_64
Java Version: Java 1.8.0_181-b13 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
-----------------------------------------------------------------------------------------------------
MATLAB Version 9.6 (R2019a)
Simulink Version 9.3 (R2019a)
Control System Toolbox Version 10.6 (R2019a)
DSP System Toolbox Version 9.8 (R2019a)
Deep Learning Toolbox Version 12.1 (R2019a)
Image Processing Toolbox Version 10.4 (R2019a)
Instrument Control Toolbox Version 4.0 (R2019a)
Optimization Toolbox Version 8.3 (R2019a)
Parallel Computing Toolbox Version 7.0 (R2019a)
Signal Processing Toolbox Version 8.2 (R2019a)
Simulink Control Design Version 5.3 (R2019a)
Statistics and Machine Learning Toolbox Version 11.5 (R2019a)
Symbolic Math Toolbox Version 8.3 (R2019a)
>> feature('numcores')
MATLAB detected: 44 physical cores.
MATLAB detected: 88 logical cores.
MATLAB was assigned: 8 logical cores by the OS.
MATLAB is using: 8 logical cores.
MATLAB is not using all logical cores because hyper-threading is enabled.
MATLAB is not using all logical cores because Operating System restricted the number of cores to: 8.
How do I force MATLAB to use all of the physical cores in the system?
I tried parpool, but it is always limited by the number of workers:
>> parpool('local',44)
Starting parallel pool (parpool) using the 'local' profile ...
Error using parpool (line 113)
You requested a minimum of 44 workers, but the cluster "local" has the NumWorkers property set to allow a maximum of 8 workers. To run a communicating job on more workers than this (up to a maximum of 512 for
the Local cluster), increase the value of the NumWorkers property for the cluster. The default value of NumWorkers for a Local cluster is the number of cores on the local machine.
Thanks

Answers (1)

Raymond Norris 2021-9-16
To increase the number of permissible local workers, you could run
local = parcluster('local');
local.NumWorkers = 44;
pool = local.parpool(44);
But that's not going to solve your issue: you have now oversubscribed the cores that the OS allotted to MATLAB. My guess is that you're either running under cgroups or through a scheduler (e.g. PBS) that is allocating 8 cores to your job. You need to increase the allocated cores at the OS/scheduler level. MATLAB will then be given more cores, and you can safely run more workers without needing to set local.NumWorkers.
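One way to confirm that the limit comes from the OS rather than from MATLAB is to ask the shell that launches MATLAB how many cores it is allowed to use. The following is a minimal sketch, assuming a Linux node with GNU coreutils (`nproc`) and a readable `/proc`; the function names are hypothetical helpers, not part of MATLAB or any scheduler.

```shell
#!/bin/bash
# Sketch: compare the cores installed on the node with the cores this
# shell (and anything it launches, such as MATLAB) may actually use.

# Cores available to this process; honours the CPU affinity mask
# (e.g. one set by a cpuset or by taskset).
allowed_cores() {
    nproc
}

# All processors installed on the machine, ignoring affinity.
installed_cores() {
    nproc --all
}

# Affinity list as reported by the kernel, e.g. "0-7".
allowed_cpu_list() {
    awk '/^Cpus_allowed_list:/ {print $2}' /proc/self/status
}

echo "allowed:   $(allowed_cores)"
echo "installed: $(installed_cores)"
echo "cpu list:  $(allowed_cpu_list)"

if [ "$(allowed_cores)" -lt "$(installed_cores)" ]; then
    echo "This shell is restricted to a subset of the node's cores."
fi
```

If this reports 8 allowed cores out of 88 installed, the restriction is applied before MATLAB starts and has to be lifted at the OS or scheduler level. Note that `nproc` reflects affinity-based limits and also honours `OMP_NUM_THREADS`; a cgroup CPU quota may not show up here.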
2 Comments
christian matira 2021-9-16
Your solution worked! Thanks. In my case, oversubscription is not a problem, since the node is exclusively allocated to me for the time being.
I also tested running it through a scheduler (Slurm, in our case). Here is a sample script for others' reference; MATLAB properly sees the worker processes:
#!/bin/bash
#SBATCH --partition=batch
#SBATCH --qos=240c_batch
#SBATCH --job-name="test"
#SBATCH --output=output.log
#SBATCH --requeue
#SBATCH -w cpu-05
#SBATCH --error=error.log
#SBATCH -n 44
#SBATCH --reservation=matlab
cd <matlab_bin>
./matlab -nodisplay -r "run('/home/<user>/test_script.m');exit;"
Thanks again
Raymond Norris 2021-9-16
I have two small suggestions.
Rather than hardcoding the pool size to be 44, use the Slurm environment variable SLURM_NPROCS (since you're setting the -n switch).
local = parcluster('local');
sz = str2num(getenv('SLURM_NPROCS'));
if isempty(sz)
    % Not running in a Slurm job, so default to 44
    sz = 44;
end
local.NumWorkers = sz;
pool = local.parpool(sz);
This gives you the flexibility to change it in the Slurm script without also needing to modify your MATLAB code.
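The same fallback logic can also be mirrored in the job script itself, for example to log the pool size that MATLAB should end up with. This is only a sketch: `pool_size` is a hypothetical helper, the default of 44 is taken from this thread, and it checks `SLURM_NTASKS` first, which Slurm documents as the current name, with `SLURM_NPROCS` kept for backward compatibility.

```shell
#!/bin/bash
# Hypothetical helper: print the worker count implied by the Slurm job,
# or a default when not running under Slurm (or when the value is not
# a positive integer). The optional first argument overrides the default.
pool_size() {
    ps_default="${1:-44}"
    # SLURM_NTASKS is set from -n/--ntasks; SLURM_NPROCS is the older alias.
    ps_n="${SLURM_NTASKS:-${SLURM_NPROCS:-}}"
    case "$ps_n" in
        ''|*[!0-9]*|0*) echo "$ps_default" ;;  # empty, non-numeric, or leading zero
        *)              echo "$ps_n" ;;
    esac
}

echo "Expecting a pool of $(pool_size) workers"
```

Writing this line to the job's output log makes it easy to check afterwards whether the pool that MATLAB opened matches what the scheduler granted.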
Second, change the following
cd <matlab_bin>
./matlab -nodisplay -r "run('/home/<user>/test_script.m');exit;"
to
module load matlab
matlab -batch "run('/home/<user>/test_script.m')"
This assumes you have a module system. If you're running Slurm, you most likely have module. You may need to modify the "matlab" string (e.g. "math/matlab/R2021a"). Run module avail to see what module packages are available.
With the -batch switch (available since R2019a), you no longer need -nodisplay or the call to exit.
