How to profile calls in a for loop?

Carlos 2021-1-28
Commented: Carlos 2021-1-29
I have a piece of code that does something like:
mf = matfile('newfile','Writable',true);  % Writable takes a logical value
mf.var01 = zeros(100,100,100);            % preallocate in the MAT-file
for ii = 1:100
    data = ncread('filename',inputs);
    mf.var01(:,:,ii) = myfunction(data);  % write one slice per iteration
end
Although, as you can see from the code above, I'm preallocating the matrices, the for loop slows down significantly over time. Something in myfunction is causing this, but I'm having trouble tracking down the problem.
If I use the profiler, I can see the total number of calls made to myfunction, but I don't think there's a way to see how much time each call took, or to profile each individual call in the for loop to understand why the 100th call was slower than the first.
Any idea of how to debug this? (I am fairly positive this isn't a classic 'you forgot to preallocate matrices' problem.)
Thanks.
  8 comments
Adam Danz
Adam Danz 2021-1-29
The random-data preallocation was to test whether the speed decrease was related to replacing the 0s in mf.var01 with more complex values. Preallocating with random values definitely does not help, as your data indicate, and I'm sure you understand. The first plot you showed, with 0-preallocation, resulted in durations that grew from 20 to 30 seconds. The second plot, with rand-preallocation, resulted in durations that are all greater than 29 seconds. The rand-preallocation wasn't meant to speed things up; it was a diagnostic test.
With 0-preallocation we saw a speed decrease as the number of iterations increased.
With rand-preallocation, the speed varied but did not gradually decrease as the number of iterations increased. Furthermore, it stayed close to, but a little above, the maximum duration of the 0-preallocated run toward the end of that loop, when the array no longer held many 0s.
This is evidence that the slow-down is due to the increased complexity of variables stored in your array.
If the speed of each iteration was primarily dependent on partial loading with matfile, you'd still see the gradual change.
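As a sketch, the diagnostic described above amounts to swapping the preallocation line (reusing the file and variable names from the question):

```matlab
% Diagnostic: preallocate with random values instead of zeros, so the
% stored array is "complex" from the first iteration. If iteration times
% no longer grow over the loop, the slow-down was tied to the array's
% contents rather than to the loop itself.
mf = matfile('newfile','Writable',true);
mf.var01 = rand(100,100,100);   % rand-preallocation instead of zeros(...)
```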
Carlos
Carlos 2021-1-29
Indeed, that makes sense.
Now the question is, is 30 seconds for this operation reasonable for what we expect from matfile?
Best.
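One way to get a feel for whether the ~30 seconds per iteration is dominated by the matfile partial write is to time the write in isolation. A sketch, assuming mf and var01 already exist as in the question, with a random slab standing in for one myfunction result:

```matlab
% Time just the matfile partial write, separate from myfunction, to see
% how much of each iteration the disk I/O accounts for.
slab = rand(100,100);        % stand-in for one slice of results
tWrite = tic;
mf.var01(:,:,1) = slab;      % partial write of a single slice
toc(tWrite)                  % compare against the full-iteration duration
```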


Answers (1)

Adam Danz
Adam Danz 2021-1-28
Edited: Adam Danz 2021-1-28
I'd start with a simple tic/toc time of each iteration and plot out the results to see if the increase happens at a specific iteration or gradually increases. You can use that data to know when to pause the loop to determine if the 'data' variable or your function is doing something different.
More info on timing loops: see this answer (this demo uses method 3).
mf = matfile('newfile','Writable',true);  % Writable takes a logical value
mf.var01 = zeros(100,100,100);
tStart = tic();                 % added
times = nan(1,100);             % added
for ii = 1:100
    data = ncread('filename',inputs);
    mf.var01(:,:,ii) = myfunction(data);
    times(ii) = toc(tStart);    % added: cumulative elapsed time
end
% added this section
iterationDurations = diff([0, times]);   % per-iteration durations
plot(1:100, iterationDurations)
Keep in mind the first iteration or so may take longer than the next few due to the JIT compiler. If the computations done within the loop are nearly uniform in complexity, then it's expected that each iteration will become faster as the number of iterations increases. A decrease in speed suggests that the size of "data" may be increasing, or that the operations performed on "data" in "myfunction" may differ between iterations.
  3 comments
Adam Danz
Adam Danz 2021-1-28
Edited: Adam Danz 2021-1-28
The profiler does not return info on single iterations of loops to my knowledge. But before you produce a ton of profiler reports, I'd plot the tic/toc time of each iteration to understand if it's a gradual growth of time or a sudden step that reliably happens at the same iteration.
If there's a sudden jump in time you have a starting point in your investigation and can run the profiler at that iteration.
If it's a gradual change, you can also collect the size of the "data" variable in each iteration (assuming it changes; I have no idea) and plot that against the tic/toc durations to see if that's the culprit. If the size of that variable doesn't change, then perhaps its content adds to the complexity in your function.
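A minimal way to collect that, sketched against the loop from the answer above (variable names are the question's):

```matlab
dataBytes = nan(1,100);              % memory footprint of "data" per iteration
for ii = 1:100
    data = ncread('filename',inputs);
    info = whos('data');             % size/bytes of the workspace variable
    dataBytes(ii) = info.bytes;      % record to compare against the timings
    mf.var01(:,:,ii) = myfunction(data);
end
plot(dataBytes)                      % flat line rules out growing "data"
```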
Another possibility is the changing content of mf.var01. Preallocating that variable reduces overhead, but that doesn't mean changes to that variable won't affect processing time. As that variable changes from 0s to larger values, it consumes more memory, which can affect speed.
If all else fails, you can program the function to produce a profiler report every n iterations to get ~5 reports that span the first and last iteration.
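A sketch of that idea, assuming profsave is acceptable for writing the HTML reports (the value of n and the folder names are made up for illustration):

```matlab
n = 150;                              % profile every n-th iteration (assumed)
for ii = 1:700
    if mod(ii, n) == 1
        profile on                    % start profiling this iteration
    end
    data = ncread('filename',inputs);
    mf.var01(:,:,ii) = myfunction(data);
    if mod(ii, n) == 1
        profile off                   % stop and save a report for this iteration
        profsave(profile('info'), sprintf('profile_iter_%03d', ii));
    end
end
```

With 700 iterations this produces reports at iterations 1, 151, 301, 451, and 601, which can then be compared side by side.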
Carlos
Carlos 2021-1-28
Edited: Carlos 2021-1-29
Thanks, very helpful. The size of "data" doesn't change much (it's the output of a numerical model), and the change is gradual. The iteration time starts growing at about the 40th iteration (of 700+) or so. I've tic/toc'ed the reading of the data, and that doesn't seem to be the problem.
The memory footprint doesn't seem to be growing that much, but maybe the system is swapping more to disk over time or something like that.
I'll keep trying. Thanks.

