Replace nested loops?
Is it possible to replace 4 nested for loops of the form: for if for for if for if ...
with something that is more efficient?
Some of the data I run have millions of variables, and I have lots of data sets to run, so it takes a few days to finish them. Anything that would lower the run time of this section would be greatly appreciated. Thanks
[EDITED: Code copied from the comments, Jan Simon]
for i=1:length(starts)
    counter = 0;
    if isempty(starts{i}) == 0
        for j = 1: length(starts{i})
            for k = 1: length(starts)
                if isempty(starts{k}) ==0
                    for m = 1:length (starts{k})
                        if stops{i}(j) >= starts{k}(m) && stops{i}(j)< stops{k}(m) && isempty(peak_loc3{k})==0 && peak_loc3{i}(j)~= peak_loc3{k}(m)
                            counter = counter +1;
                            overlap{1,i}(counter) = peak_loc3{k}(m);
                            overlap{2,i}(counter) = peak_loc3{i}(j);
                        end
                    end
                end
            end
        end
    end
end
1 Comment
Sean de Wolski
2011-12-7
We really need to see the operations to figure out if it's possible. A well orchestrated for-loop should be fairly fast in newer versions.
Accepted Answer
Sven
2011-12-7
A small time-drain will be the fact that inside the loop, the overlap variable gets constantly resized. In the MATLAB editor, these variables will have a little orange line under them. If you hover over that line, it will warn you about this potential problem.
Here's a first attempt that will reduce the time needed. Note I've also replaced "isempty(x)==0" with "~isempty(x)" (for simplicity) and replaced some of the nested if statements with continue statements, just to have less nesting (which can get confusing).
overlap = cell(2, length(starts));
nonEmptyStarts = find(~cellfun(@isempty,starts));
for i=nonEmptyStarts
    counter = 0;
    thisStart = starts{i};
    thisStop = stops{i};
    for j = 1: length(thisStart)
        for k = nonEmptyStarts
            if isempty(peak_loc3{k}), continue; end
            thatStart = starts{k};
            thatStop = stops{k};
            thisMask = thisStop(j)>=thatStart & thisStop(j)<thatStop & peak_loc3{i}(j)~=peak_loc3{k}(1:length(thatStart))';
            for m = find(thisMask)
                counter = counter +1;
                overlap{1,i}(counter) = peak_loc3{k}(m);
                overlap{2,i}(counter) = peak_loc3{i}(j);
            end
        end
    end
end
Unfortunately there is still a big culprit of "variable size adjustment" sitting inside a loop, which will really slow down the code. If you see the line starting with overlap{1,i}(counter) =, you'll notice that every time this line is run, the variable sitting in the cell at overlap{1,i} grows by one. If this happens a lot, MATLAB has to work really hard to find new space in memory fitting this new size.
This updated code currently gives an approximately 10-fold reduction in running time compared to the original.
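To illustrate the resizing problem described above, here is a minimal toy sketch (not the poster's data; the values and the variable names n, grown and buffer are made up for illustration). It compares growing a vector one element at a time with filling a preallocated buffer and trimming it once at the end. It assumes an upper bound on the number of results is known in advance, which may or may not hold for the real overlap data.

```matlab
% Toy comparison: growing a vector inside a loop vs. filling a
% preallocated buffer and trimming it once afterwards.
n = 1000;                      % hypothetical number of candidates

% (a) Growing: the array is resized every time a value is appended.
grown = [];
for t = 1:n
    if mod(t, 3) == 0          % keep every third value (toy condition)
        grown(end+1) = t;
    end
end

% (b) Preallocated: reserve the upper bound, fill, then trim.
buffer = zeros(1, n);          % n is an upper bound on the result count
counter = 0;
for t = 1:n
    if mod(t, 3) == 0
        counter = counter + 1;
        buffer(counter) = t;   % writes into existing memory
    end
end
buffer = buffer(1:counter);    % drop the unused slots once
```

Both variants produce the same values; only the second avoids resizing on every append.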
UPDATE
overlap = cell(2, length(starts));
nonEmptyStarts = find(~cellfun(@isempty,starts));
for i=nonEmptyStarts
    counter = 0;
    % Get column vectors of the first start/stop pairs
    startA = starts{i}'; stopA = stops{i}';
    for k = nonEmptyStarts
        % Get row vectors of the second start/stop pairs
        startB = starts{k}; stopB = stops{k};
        % Get a mask of all A-B pairs that match requirements
        ABMask = bsxfun(@ge,stopA,startB) & ...
            bsxfun(@lt,stopA,stopB) & ...
            bsxfun(@ne,peak_loc3{i}(1:numel(startA)), peak_loc3{k}(1:numel(startB))');
        [j,m] = find(ABMask);
        numToAdd = length(m);
        if ~numToAdd, continue; end
        % Append them to "overlap"
        indsToInsert = (1:numToAdd) + counter;
        counter = counter + numToAdd;
        overlap{1,i}(indsToInsert) = peak_loc3{k}(m);
        overlap{2,i}(indsToInsert) = peak_loc3{i}(j);
    end
end
This update should make significant improvements on a large dataset. There is still room for improvement, depending on the type and sizes of data you have. You can actually get a good view of what parts of the code take the most time by replacing tic and toc with profile on and profile viewer.
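To make the bsxfun mask in the update easier to follow, here is a small toy example with made-up interval values (stopA, startB and stopB below are hypothetical, not taken from the real data). A column vector compared against a row vector yields a full matrix of pairwise comparisons, so the inner j and m loops collapse into a single find call.

```matlab
% Toy intervals (hypothetical values)
stopA  = [5; 12; 20];      % column vector: stops of set A
startB = [1 4 18];         % row vector: starts of set B
stopB  = [6 15 25];        % row vector: stops of set B

% mask(j,m) is true when stopA(j) lies in [startB(m), stopB(m))
mask = bsxfun(@ge, stopA, startB) & bsxfun(@lt, stopA, stopB);

% Row (j) and column (m) indices of every matching pair
[j, m] = find(mask);
% Matching (j,m) pairs here: (1,1), (1,2), (2,2), (3,3)
```

In MATLAB R2016b and later, implicit expansion should allow the same mask to be written as (stopA >= startB) & (stopA < stopB); bsxfun keeps the code working on older releases.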
I have a feeling that the assignment into overlap will still be the biggest area for possible improvement.
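One possible way to reduce the cost of that assignment, sketched here on toy data rather than on the real overlap variable, is to store the results of each inner iteration in its own cell and concatenate everything once after the loop. The names data, threshold, parts and combined are hypothetical, and the sketch assumes all pieces are row vectors. Whether this actually helps on the real data is something the profiler would have to confirm.

```matlab
% Hypothetical inputs standing in for the per-k results
data = {[3 9 1], [2 0], [8], [7 5 6]};
threshold = 4;

% One cell per iteration: nothing numeric is resized inside the loop
parts = cell(1, numel(data));
for k = 1:numel(data)
    v = data{k};
    parts{k} = v(v > threshold);   % may be empty for some k
end

% Single concatenation after the loop (empty pieces are skipped)
combined = [parts{:}];
```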