Is writing to an existing variable 9% slower than clearing it and creating it again? Generating a new one is 10 times as fast!?

Hi,
I know preallocation is primarily meant to save the overhead and fragmentation that occur when resizing a variable, but I've stumbled upon something strange regarding large variables that I would call inverse preallocation behavior.
Take this example:
clear;
tic; b = zeros(35000); t1 = toc;   % b does not exist yet
tic; b = zeros(35000); t2 = toc;   % b already exists and is overwritten
t1 = 0.0097
t2 = 0.1018
The same happens with gpuArrays. So generating a new variable is significantly faster than writing to an existing one. This applies to every function or operation I can come up with. Clearing the variable usually takes less time than writing to it; however, the combined time of clearing and then generating a new one seems, on average, to match the time it takes to write to a preallocated one.
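The observation above can be checked by timing each step separately. This is only a minimal sketch: the size n is an assumption chosen so that it runs on most machines (the example above uses 35000), and the variable names tCreate, tOverwrite, tClear and tRecreate are made up for illustration.

```matlab
% Sketch: time creation, overwrite, clear, and re-creation separately.
n = 2000;            % assumption: much smaller than the 35000 used above
clear b;             % make sure b does not exist (no error if it is absent)

tic; b = zeros(n); tCreate    = toc;   % b does not exist yet
tic; b = zeros(n); tOverwrite = toc;   % b already exists and is replaced
tic; clear b;      tClear     = toc;   % clearing only
tic; b = zeros(n); tRecreate  = toc;   % b is created again after the clear

fprintf('create %.4f s, overwrite %.4f s\n', tCreate, tOverwrite);
fprintf('clear %.4f s + recreate %.4f s = %.4f s\n', ...
        tClear, tRecreate, tClear + tRecreate);
```

Single tic/toc readings are noisy at small sizes, so repeating the measurement a few times and comparing medians would probably give a fairer picture.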
I was wondering whether this behavior is normal and whether there is anything one could do to speed things up. Again, it's a factor of 10. Ideally there would be a way to always get the faster behavior.
For large loops, if I can afford the memory penalty, I now actually create several variables instead of overwriting one.
No kidding, this is sometimes significantly faster than reusing the same variable. The code looks awful then; sometimes manually iterating within loops (creating new variables) is faster than just looping over all of it. The same applies to parfor loops and gpuArrays...
In the end one can gain 10% or so by doing the opposite of preallocation, that is, by clearing the variables used in each loop iteration. Example:
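clear;
t1 = 0;
t2 = 0;
a = single(rand(16000));
a = gpuArray(a);              % omit this line for the CPU case
for i = 1:200
    tic;
    b = sinc(a+rand(1)).^2;
    clear b;                  % remove this line to time the overwrite variant
    t1 = t1 + toc;            % accumulate the time over all iterations
end
t1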
On both CPU and GPU, the time is always about 7-10% lower if b gets cleared in the loop. Only for small variables is the opposite true. Please tell me I'm missing something. I don't want to code like this, but if it saves significant time in week-long simulations... sure, I'm inclined to do it :-/ I would prefer, though, that MATLAB offered me optimal performance under the hood.
My system spec: Intel 9900K running Windows 10 (all updates) and R2018b, 64 GB RAM, and an NVIDIA Titan RTX.
Regards, Arnold
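For reference, the comparison described above could be sketched as follows, with both variants timed in one script. The assumptions are mine: the sizes are reduced so it runs quickly, sin stands in for sinc so that no toolbox is needed, and the names n, nIter, tClear and tKeep are purely illustrative.

```matlab
% Sketch: same workload, timed with and without clearing b each iteration.
n     = 1000;                 % assumption: reduced from 16000
nIter = 20;                   % assumption: reduced from 200
a = rand(n, 'single');        % CPU array; wrap in gpuArray(...) for the GPU case

tClear = 0;                   % total time when b is cleared every iteration
tKeep  = 0;                   % total time when b is simply overwritten

for i = 1:nIter
    tic;
    b = sin(a + rand(1)).^2;  % sin used here in place of sinc
    clear b;
    tClear = tClear + toc;
end

for i = 1:nIter
    tic;
    b = sin(a + rand(1)).^2;  % b is overwritten on every pass after the first
    % For gpuArray input, call wait(gpuDevice) here before toc, since GPU
    % work is queued asynchronously and toc could otherwise fire too early.
    tKeep = tKeep + toc;
end

fprintf('with clear: %.4f s, without clear: %.4f s\n', tClear, tKeep);
```

At these reduced sizes the difference may vanish in the noise or even reverse, which would match the remark above that the effect only shows up for large variables.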