How does the Parallel Computing Toolbox (PCT) add further pressure to memory requirements, and how can I mitigate memory swapping?

On my dual-processor Mac device with 128 GB of RAM, when running "sequentialfs" with MATLAB R2023b in parallel mode on a large dataset (150GB), why does MATLAB swap to disk memory prematurely when there's still RAM available?
The iStatPro shows that MATLAB utilizes about 60 GB of active memory and starts to swap to disk memory. Is there a way to prevent this?

Accepted Answer

MathWorks Support Team on 5 Apr 2024
MATLAB may be swapping prematurely because of a combination of the high memory footprint of the Parallel Computing Toolbox (PCT) and the way the operating system handles parallelism. We do not have access to a dual-processor Mac machine, so we cannot test this configuration directly. However, we can try to mitigate the high memory demands imposed by the PCT.
In general, PCT will always add further pressure to memory requirements. For example, consider a standard MATLAB process that uses 2GB of RAM, with a 1GB variable in the workspace, totaling approximately 3GB of memory. A user may then write a PCT algorithm that employs a pool of 12 workers.
  1. The 12 workers will start 12 extra MATLAB processes, each potentially using about 2GB of memory (+24GB).
  2. The algorithm may need to broadcast that 1GB variable to each worker MATLAB in the computation (+12GB).
  3. During data transfer, there can temporarily be two extra copies of the variable on both the sending and the receiving side. This contributes to a spike in memory usage, not to the steady-state usage (worst case: +2GB + 12 * 2GB = +26GB).
Adding these three contributions gives a worst-case peak of 62GB of additional memory (24GB + 12GB + 26GB) on top of the roughly 3GB already used by the client MATLAB, or about 65GB in total. This spike in memory usage may cause MATLAB to swap to disk temporarily. The swap usage reported by iStatPro may be left over from such an earlier spike, which can make it look as though MATLAB starts swapping while RAM is still available.
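The arithmetic in this example can be reproduced with a short script. This is only a sketch: the variable names are made up for illustration, and the numbers are the ones from the example above, not measurements from your machine.

```matlab
% All sizes are in GB and come from the example above (not measured values).
clientBaseGB = 2;    % client MATLAB process
variableGB   = 1;    % workspace variable that gets broadcast
numWorkers   = 12;   % size of the parallel pool
workerBaseGB = 2;    % assumed footprint of each worker MATLAB process

% 1. Extra MATLAB processes started for the workers.
workersGB = numWorkers * workerBaseGB;

% 2. One copy of the broadcast variable on every worker.
broadcastGB = numWorkers * variableGB;

% 3. Transient spike: two extra copies on the sender and on each receiver.
transferGB = 2 * variableGB + numWorkers * 2 * variableGB;

% Extra memory attributable to the pool, and the overall worst-case peak.
pctOverheadGB = workersGB + broadcastGB + transferGB;
totalPeakGB   = clientBaseGB + variableGB + pctOverheadGB;

fprintf("PCT overhead: %d GB, total peak: %d GB\n", pctOverheadGB, totalPeakGB);
```

Changing numWorkers in this sketch shows how quickly the estimate shrinks as the pool gets smaller. With 4 workers, the same formulas give 22GB of overhead instead of 62GB.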
In summary, each worker used by PCT needs enough memory to start and run a session of MATLAB, to hold a copy of the data it is working on, and to accommodate any additional memory demanded by the algorithm. The following steps can reduce that footprint:
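  1. Lowering the number of workers on the machine will reduce the memory footprint. When using a process-based parallel pool, parpool("local") or parpool("Processes") (the "local" profile was renamed "Processes" in R2022b), the "NumWorkers" property of the cluster profile controls how many worker processes are started. You can also request a specific pool size by passing it as the second argument to parpool. We generally advise that the number of workers should be fewer than the number of physical cores.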
    >> parpool("local");
    >> parpool("Processes");
  2. If memory usage continues to be an issue, consider using a thread-based parallel pool by running parpool("threads"). This runs multiple MATLAB computational sessions inside a single process and often has a far lower memory footprint than a process-based pool.
    >> parpool("threads")
  3. More details about the differences between process-based and thread-based parallel pools are available in the MathWorks documentation.
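Options 1 and 2 above might look like the following in practice. This is a minimal sketch, not a tested recipe for this machine. The worker count of 4 is a hypothetical value chosen for illustration, and the requested size must not exceed the "NumWorkers" limit of the cluster profile.

```matlab
% Hypothetical pool size for illustration; choose a value below the number
% of physical cores on your machine.
numWorkers = 4;

% Shut down any pool that is already running so its workers release memory.
delete(gcp("nocreate"));

% Option 1: process-based pool with an explicit, smaller size.
pool = parpool("Processes", numWorkers);
fprintf("Process-based pool started with %d workers\n", pool.NumWorkers);

% ... run the parallel computation here ...

% Option 2: if memory is still tight, replace it with a thread-based pool.
delete(pool);
pool = parpool("threads");
fprintf("Thread-based pool started with %d workers\n", pool.NumWorkers);
```

Not every function is supported on thread-based pools, so check whether your computation runs there before relying on option 2.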
