Unable to achieve desired speed up using parfor

Hi,
I am launching several instances of a MATLAB (p)code using parfor loops on two computers with the following configurations.
Comp A: 16 core 3.4GHz, 8GB per core @ 3200MHz,
Comp B: 32 core 3.6GHz, 8GB per core @ 3200MHz,
I am launching 16 instances on A and 32 on B. I find that all instances on B finish in about half the time of those on A. This baffles me, since the specs scale almost identically. All instances also do the same thing, so the computational load per instance is identical. Is there any hardware optimization that should be done for better efficiency on A?
5 Comments
Balachandra Suri 2024-10-31,3:19
Please let me know what other information would be useful. Motherboard config?
Rik 2024-11-1,7:47
My initial guess was that the CPU generations would be different, and hence the number of instructions per cycle might be different. That doesn't seem to be the case here.
Perhaps it is the cache? If everything fits in the CPU cache there is no need to go to RAM. I don't have any other plausible cause, unless the smaller chip doesn't actually reach the frequency you mentioned due to thermal and/or power throttling.
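One way to test the cache and throttling guesses is to time a single instance while varying how many instances run at once. The following is only a minimal sketch: `work` is a hypothetical stand-in for the real (p)code, the pool sizes are assumptions to adjust per machine, and it requires the Parallel Computing Toolbox with a cluster profile that allows the requested worker counts.

```matlab
% Hypothetical stand-in for one instance of the real workload (assumption).
work = @() sum(sum(rand(1500) * rand(1500)));

poolSizes = [1 4 16];                 % assumed values; adjust to the core count
meanTime  = zeros(size(poolSizes));   % mean seconds per instance, per pool size

for k = 1:numel(poolSizes)
    n = poolSizes(k);
    delete(gcp('nocreate'));          % close any pool that is already open
    parpool(n);                       % open a pool with n workers
    t = zeros(1, n);
    parfor i = 1:n
        tStart = tic;                 % time each instance on its own worker
        out = work();                 %#ok<NASGU> result is not needed here
        t(i) = toc(tStart);
    end
    meanTime(k) = mean(t);
end

disp(table(poolSizes(:), meanTime(:), ...
    'VariableNames', {'Workers', 'MeanSecondsPerInstance'}));
```

If the time per instance grows noticeably as more workers run together, the instances are competing for a shared resource (cache, memory bandwidth, or the power and thermal budget) rather than being limited by core count alone.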


Answers (1)

Matt J 2024-11-1,10:58
"I find that all instances on B finish in about half the time as those on A. It baffles me..."
That is the expected result, assuming you are running the same loop on both computers. Assuming, for example, that it is a 32-iteration loop,
parfor i=1:32
...
end
then Comp A would be assigned 2 iterations per core, while Comp B would be assigned only 1. So it makes perfect sense that Comp B finishes in half the time.
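The arithmetic behind this can be sketched as follows. It is an idealized model, not a measurement: it assumes every iteration takes the same time `tIter` (a hypothetical value in arbitrary units) and that iterations are spread evenly over one worker per core.

```matlab
% Idealized model: equal-cost iterations, evenly divided among workers.
nIter    = 32;        % loop length from the example above
tIter    = 1;         % hypothetical time per iteration (arbitrary units)
nWorkers = [16 32];   % Comp A, Comp B (one worker per core is assumed)

% Each worker runs ceil(nIter / nWorkers) iterations back to back.
itersPerWorker = ceil(nIter ./ nWorkers);
wallTime       = itersPerWorker * tIter;

fprintf('Comp A: %d iterations per worker, wall time %g\n', ...
    itersPerWorker(1), wallTime(1));
fprintf('Comp B: %d iterations per worker, wall time %g\n', ...
    itersPerWorker(2), wallTime(2));
```

Under the same model, a 16-iteration loop on Comp A and a 32-iteration loop on Comp B would each give 1 iteration per worker and equal wall times, so this explanation applies only if the loop length is the same on both machines.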

Release

R2023a
