Nested Loops and efficiency

Question

msh 2014-6-21

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/136884-nested-loops-and-efficiency

回答： msh 2014-6-27

Hi,

I have the following function, and there are some nested loops with multiple if statements inside. The code, runs extremely slow in Matlab ( fortran executes the whole code in 15 min, matlab needs a couple of hours). Nonetheless, this is one part of the code, that Matlab needs some time to complete

function [ystar, cstar, plterm]=decision_rules(r,w, brolillas, model, pl_flag)
global nshock one update tinies aproxdr e a 
du=zeros(length(a)+1);
errrel = 3.0e-30;
rules_converge =false;
iter=0;
%*************************************************************
% First Get last term
%*************************************************************
if(iter>1)
iterflag=1;
end 
plterm=0.0;
% reset plterm
if(brolillas<1)
plterm0=0.0d0;
end 
if(model==2  && pl_flag >= 1 && brolillas >= 1)
    [plterm]=exterm(fdist, w, r, ystar) ;
    if(abs(plterm-plterm0)<1.0d-6)
        update_flag=1;
    end
    if(abs(plbnd(1)-plbnd(2))<1.0d-9)
        update_flag=1;
    end
    if(plterm>=plterm0)
        plbnd(1)=max(plterm0,plbnd(1));
    end
    if(plterm<=plterm0)
        plbnd(2)=min(plterm0,plbnd(2));
    end
    if(brolillas >= 3)
        plterm=update_pl.*plterm+(1.0d0-update_pl).*plterm0;
        if(plterm>plbnd(2) || plterm<plbnd(1)) 
            plterm=sum(sum(plbnd))./2.;
        end
    else
        plterm=update_pl.*plterm+(1.0d0-update_pl).*plterm0;
    end;
elseif(model==2 && brolillas==0) 
    plbnd(1)=-1.;
    plbnd(2)=1.;
    plterm=plterm0;
end
pl_flag=pl_flag+1;
brolillas=brolillas+1;
% Initialization 2: construct cstar from ystar
y0=zeros(length(a),nshock); 
c0=zeros(length(a),nshock);
ystar=zeros(length(a),nshock); 
cstar=zeros(length(a),nshock); 
for i=1:nshock
    for n=1: length(a)
        cstar(n,i)=r.*a(n)+w.*e(i)-ystar(n,i);
    end
end
% Initialization 3 : initial updating
y0 =(one-update).*y0+update.*ystar;
c0 =(one-update).*c0+update.*cstar;
%****************************************************************
% Main Loop
%****************************************************************
while(~rules_converge)
    iter = iter + 1;
  disp(['Iteration No:' num2str(iter)])
%*************************************************************
% Solve the decision rule
%*************************************************************
for i=1:nshock
    for n=1:length(a)
        %Find ystar 
        for np=1:length(a)+1
              if(np>length(a))
                  break;
              end
              if(upap(n,a(np),i,r,w) > 100/tinies)
                  break;
              end
              du(np)=-upap(n,a(np),i,r,w)+ meru(a(np),i,r,c0)+plterm;
              if(du(np)<=0)
                  break;
              end
          end
          if(np>length(a))
              yy1=a(length(a));
          elseif(upap(n,a(np),i,r,w) > 100/tinies) 
              if(np==1)
                  yy1=a(1);
              else
                  yexp=a(n)*r+w*e(i) - tinies;
                  dusup=upap(n,yexp,i,r,w)-meru(yexp,i,r,c0)-plterm;
                  if((yexp-a(np-1))>tinies && dusup>0.)
                      yy1=zbrent(a(np-1),yexp,errrel,i,n,r,w,c0,plterm);
                  else
                      yy1=a(np-1);
                  end
              end
          else
              if(np==1)
                  yy1=a(1);
              else
                  yy1=zbrent(a(np-1),a(np),errrel,i,n,r,w,c0,plterm);
              end
          end
          ystar(n,i) = yy1;
          cstar(n,i) = r.*a(n)+w.*e(i)-ystar(n,i);
      end
  end
%*************************************************************
% Check convergence
%*************************************************************
no_converge_y=zeros(length(a),nshock);
no_converge_y(abs(ystar-y0)>aproxdr)=1;
no_converge =fix(sum(sum(no_converge_y)));  
tempmaxval=abs(ystar-y0);
error=max(tempmaxval(:));
disp(['the error in optimal plans is:' num2str(error)])
if(no_converge==0)
   rules_converge=true;
else
    rules_converge=false;
end
%*************************************************************
% Update y0 and c0
%*************************************************************
y0=ystar;
c0=cstar;
end
   disp(['Total Iteration No:' num2str(iter)])
end

Can anybody here suggest some modifications that I could improve efficiency ?

Thanks

4 个评论
显示 2更早的评论隐藏 2更早的评论

dpb 2014-6-22

编辑：dpb 2014-6-23

在 MATLAB Online 中打开

See

doc profile

for running the profiler. It's not the overall time taken, it's where within the code precisely the largest fraction of time is taken up that's important so as to know precisely where in the overall function one should concentrate efforts.

The difficulty of this code in Matlab to generate speed while guaranteeing the same result is that the loops all have all the break conditions in them and the later results are dependent upon the point within the loop at which the exit occurred. A prime example is that condition on the first loop end index Jan and I were just discussing. The next step is dependent on where that loop exits and why.

To get best performance in Matlab one needs to be able to vectorize the loops but it would take a significant amount of detailed investigation of the algorithm to be able to tell what, specifically, would be the proper action after the vectorization of one of these loops that otherwise would have exited before completion. That level of effort is just too much to expect for a respondent here to be able to infer just how that should look.

There's the snippit section

if(upap(n,a(np),i,r,w) > 100/tinies), break; end
du(np)=-upap(n,a(np),i,r,w)+ meru(a(np),i,r,c0)+plterm;
if(du(np)<=0), break; end

One could easily enough write the function upap to take the vectors a and n and return a logical array or the first index of the condition but one might then find that doing the whole thing is as or more expensive than shortening the loop because the condition occurs early on in a high percentage of typical cases while the function could be relatively costly. As Jan points out, whether those functions are significant computation-wise is one of the things the profiler will tell you.

If one had a functional descripton of the inputs/expected outputs, one could possibly write a Matlab function that does that that does use typical Matlab constructs effectively, but to start from translating Fortran to Matlab for such a case is very difficult task other than the literal transliteration process you've got here and the resulting inefficiency that comes with that.

Jan 2014-6-22

编辑：Jan 2014-6-22

@Safis: To profile the code use the command profiler. Lokk in the documentation to learn more about it.

This function will reveal, if the loops are the main problem, of if the time is spent in meru and upap. Without profiling it is possible, that we optimize the nested loops, which need only some percent of the total computing time.

请先登录，再进行评论。

请先登录，再回答此问题。

Answer 1

Jan 2014-6-22

1
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/136884-nested-loops-and-efficiency#answer_141968

在 MATLAB Online 中打开

The code can be improved. E.g. this is not useful:

for np=1:length(a)+1
  if(np>length(a))
    break;
  end
  ...

What about:

for np=1:length(a)
  ...

This should be vectorized for a fair speed in Matlab:

for i=1:nshock
  for n=1: length(a)
    cstar(n,i)=r.*a(n)+w.*e(i)-ystar(n,i);
  end
end

But I don't assume, that this is a bottleneck of the code.

Nevertheless, questions about improving the speed of the code cannot be answered only by seeing the code, but you need the profiler and the readers of the forum need a running example with matching testdata.

A general but a little bit disappointing advice: Avoid expressions like 0. and 0.0d0 in Matlab.Although they work currently and should work for reasons of compatibility, this is not documented.

5 个评论
显示 3更早的评论隐藏 3更早的评论

dpb 2014-6-22

编辑：dpb 2014-6-22

在 MATLAB Online 中打开

I initially posted the same comment, Jan, but--the +1 on the loop index does do something; now whether it's useful or needed or not I don't know but it can make a difference owing to the test in the following section. If you remove the +1 from the loop limit, then first clause of the following test

if(np>length(a))
  yy1=a(length(a));
elseif(upap(n,a(np),i,r,w) > 100/tinies) 
  ...

could never be true but it can be as written if the other loop exit conditions above don't ever come to pass. At the point of realizing that and not having any other info of purpose to try to recast to end objective instead of translate I suggested just turn it into a mex-function if need to use from Matlab.

I'd think the cautionary about d/D as exponents for floating point literals is excessively conservative--I don't think there's any way TMW could ever remove that from the language w/o it causing such a huge issue in compatibility that I'd certainly not worry over it.

Granted there's no express need for it in Matlab as there is in Fortran as the default casting over everything to double makes the expression superfluous.

dpb 2014-6-22

编辑：dpb 2014-6-22

在 MATLAB Online 中打开

Since the index condition of

np=length(a)+1

causes an immediate break the value of np is defined on exit of the loop. It's not the case the index has "run off the end".

That the behavior of the iterator variable after normal loop completion isn't documented at all (whether it is undefined or retains the value at the end of the loop as Fortran) is another case where the documentation of Matlab syntax is (and always has been) lacking. I've harped on this point of needing a definitive "Standard" (for lack of better nomenclature) for going on 30 yr now.

As a practical programming matter it should be defined and be the value at the last iteration but if it's TMW's desire that it not be relied on then that should definitely be documented as being so. It's this "black hole" of no definition that is a quagmire.

dpb 2014-6-26

I submitted a Service Request on the definition of the for iterator after loop completion being undocumented behavior. The "interpretation" is that it is defined and retains the value it had during at the exit of the loop. The text of the response is included below--

I am writing in reference to your Service Request, Case #01012664 regarding 'For iterator status on completion of loop'.

The iterator is always defined at the end of a loop. Whether a variable is defined or not in any context depends upon the workspace that you are looking at and the workspace that the variable exists in.

In MATLAB, loops run in their parent workspace. So, if a for loop is defined in a script, the for loop utilizes the base workspace (the workspace used by scripts) and the iterator is defined in the base workspace.

If the for loop is running in a function, the for loop utilizes the function's workspace and hence the iterator is also defined in the function workspace.

At the end of the for loop, the iterator will retain the final value that it reached during the execution of the for loop for the last time. [emphasis added as this was our point of uncertainty and the point of the SR being filed]

You are right in your observation that this particular behavior is not clearly documented and it should be. I will go ahead and put in an enhancement request so that the documentation team can consider adding this information to the documentation in a future release of MATLAB.

请先登录，再进行评论。

Answer 2

msh 2014-6-27

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/136884-nested-loops-and-efficiency#answer_142718

profile_results.zip

Sorry for my late reply, but really the whole code took ages to complete so i was trying to do something to get some feeling of the performance. I have attached the file from the results of the whole code and not only of this function. I would be very pleased to get your insights or further guidance.

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

Nested Loops and efficiency

4 个评论
显示 2更早的评论隐藏 2更早的评论

回答（2 个）

5 个评论
显示 3更早的评论隐藏 3更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

另请参阅

类别

标签

Community Treasure Hunt

Nested Loops and efficiency

4 个评论 显示 2更早的评论隐藏 2更早的评论

回答（2 个）

5 个评论 显示 3更早的评论隐藏 3更早的评论

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

另请参阅

类别

标签

Community Treasure Hunt

4 个评论
显示 2更早的评论隐藏 2更早的评论

5 个评论
显示 3更早的评论隐藏 3更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论