codes runs too long

1 次查看(过去 30 天)
zamri
zamri 2013-1-5
Hi, i wonder if there is somebody who can help. I have this program where it calculates pressure inside a cavity. For normal case, the program just took 30 minutes to give result. I did modification to the program where the concerned codes a shown below, could this codes are the culprit where it took 15 hrs to run and give result. a, b, c, d are bessel functions that i dont bother to write here. Note: this is just partial of the program, i doubt other part is the reason because i only change numeric values for them.
HHvp=1e-10;
HHvp1=1e-10;
HHvp2=1e-10;
HHvp3=1e-10;
HHvp4=1e-10;
HHvp5=1e-10;
for m1=1:5
for n1=1:5
for l1=1:5
omega_n(m1,n1,l1)=m1*n1*l1 % a lot more to this, to long to write)
end
end
end
for n1=1:5
for l1=1:5
HHvp1= HHvp1 + a*n1*l1
end
end
for n1=1:5
for l1=1:5
HHvp2= HHvp2 + b*n1*l1
end
end
for n1=1:5
for l1=1:5
HHvp3= HHvp3 + c*n1*l1
end
end
for n1=1:5
for l1=1:5
HHvp4= HHvp4 + d*n1*l1
end
end
HHvp=HHvp1+HHvp2+HHvp3+HHvp4;
Hvp(mmm2,nnn2,lll2,num2)= HHvp;
[EDITED, Jan, code formatted]
  4 个评论
Azzi Abdelmalek
Azzi Abdelmalek 2013-1-5
Simon, I did'nt see the comment, maybe because the question was not well formated. But what is the aim to post a part of code that took 0.02 s?
Jan
Jan 2013-1-6
@Azzi: I hope that the OP will explain this.

请先登录,再进行评论。

采纳的回答

Jan
Jan 2013-1-6
编辑:Jan 2013-1-6
In reply to your posted code:
When you use the profile to find the most time consuming lines, a standard procedure gets obvious: Avoid repeated calculations inside a loop. Here the Bessel function waste a lot of time and using temporary variables accelerate the code by a factor > 100:
r=1e-10+(lll2-1)*dr; % After this line insert this:
tmp1 = 7.7*besselj(0,48.3*r)-0.32*bessely(0,48.3*r);
tmp2 = -22*besselj(0,96.7*r)-18.2*bessely(0,96.7*r);
tmp3 = -3*besselj(0,145*r)-17.5*bessely(0,145*r);
tmp4 = -0.8*besselj(0,193*r)-1.6*bessely(0,193*r);
tmp5 = -1.8*besselj(0,242*r)-0.37*bessely(0,242*r);
% And now replace the correspondig calculations by "tmpX" using the
% Replace dialog of the editor.
Btw., the code is extremely ugly. A substantial cleanup would at first increase the speed and at second allow a successful debugging. Most of all consider the MLint warnings (see the orange bars in the Editor). Then add comments and split the very large lines into readable pieces. Then it will be much easier to identify the parts, which are calculated repeatedly. Finally I guess, that a speedup of factor 1000 is possible without any magic or sophisticated tricks.
  • Is sqrt(m1*pi/(R2-R1))^2 really wanted?
  • In the calculation of Hvp_0_1_0 you assume that n1 has the last value from the last FOR loop, which uses n1 as counter. Better define n1 and l1 explicitly outside the loops.
  • Is "3.142" a crude approximation of pi? If so, don't do it. Never loose accuracy, when there is no benefit.
  3 个评论
zamri
zamri 2013-1-6
dear Jan,
thanks so much for your advise, i was able to run it in 5 minutes. I appreciate it so much. Sorry for being like 5 yr old on this.
Jan
Jan 2013-1-6
编辑:Jan 2013-1-6
I've replaced some constants in the loops and abbreviated whatever I found logical. Of course I cannot debug the results, so I leave this up to you. Some details look strange, like the mentioned sqrt(x)^2.
With less redundancies, the code runs in 100 seconds. Perhaps the code is even less readable as before, sorry. When this code is used for serious science, I would suggest to rewrite it from scratch after mathematical simplifications.
clc
%Torus Dimension
R2=0.275;
R1=0.21;
W=0.205;
ht=0.015;
Theta=2*pi;
%Young's Modulus
E=7.5E+7;
% Poisson's Ratio
GAMA=0.40;
% Plate density
Lo=1350;
Lo0=1.3;
%Plate loss factor
Eta1=0.03;
Eta2=0.03;
c=343;
%excitation force
%F=F0*sin(2*pi*f);
F0=1;
% Excitation force location
thetaf=0;
zf=W/2;
%Plate longitudinal wave velocity
cLt=sqrt(E/Lo/(1-GAMA^2));
% Plate Bending Stiffness
B=E*ht^3/(12*(1-GAMA^2));
dr0=(R2-R1)/20
dtheta0=2*pi/20;
dz0=W/20;
df=100/100;
dr=(R2-R1)/2;
dtheta=2*pi/3;
dz=W/2;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
HFv=zeros(3,3,51);
HFvv=zeros(3,3,51);
Hvp=zeros(3,3,3,51);
CHvp=zeros(3,3,3,51);
omega_mn1=zeros(5,5);
omega_mn=zeros(5,5);
p_omega=zeros(3,3,3,51);
mean_p2=zeros(3,3,3,51);
omega_n=zeros(5,5,5);
Freq=zeros(51,1);
tmp6 = (1-GAMA^2);
h = waitbar(0,'Please wait...');
for num2 = 200:250
disp(num2);
f=1e-10+(num2-1)*df;
omega=2*3.1415928*f;
omega2 = omega ^ 2;
Freq(num2,1)=f;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
coef_Hvp=0.0005i*(Theta/2)*(W/2)*omega*c^2*Lo0;
coef_HFv=2i*omega*F0/(Lo*ht*pi*W);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
coef_1_0_0=0.105i*Theta*W*omega*c^2*Lo0;
coef_0_1_0=0.041i*(Theta/2)*W*omega*c^2*Lo0;
coef_0_0_1=0.0002i*Theta*(W/2)*omega*c^2*Lo0;
coef_1_1_0=2.3i*(Theta/2)*W*omega*c^2*Lo0;
coef_0_1_1=0.0002i*(Theta/2)*(W/2)*omega*c^2*Lo0;
coef_1_0_1=0.0006i*Theta*(W/2)*omega*c^2*Lo0;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
EEc=1e-10;
EEp=1e-10;
for mmm2=1:3
theta=1e-10+(mmm2-1)*dtheta;
for nnn2=1:3
z=1e-10+(nnn2-1)*dz;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
HFFVV=1e-10;
gama=W*sqrt(ht)/((R2)^(3/2));
q2 = (1/sqrt(tmp6*Lo/E));
for m=1:5
q1 = sin(m*pi*z/W);
q3 = sin(m*pi*zf/W);
for n=1:5
omega_mn1(mmm2,nnn2)=sqrt(tmp6*((1+(m*3.142*R2/(n*W))^2)^-2)*(m*3.142/(n*gama)^4)+ ...
(1+(m*3.142*R2/(n*W))^2)^2*(n^4)/12);
omega_mn(mmm2,nnn2)=omega_mn1(mmm2,nnn2)*(ht)/R2^2*q2;
HFFVV=HFFVV+q3*sin(n*pi*thetaf/Theta)*q1*sin(n*pi*theta/Theta)/ ...
(omega_mn(m,n)^2-omega2+1i*Eta1*omega_mn(m,n)^2);
end
end
HFvv(mmm2,nnn2,num2)=coef_HFv*HFFVV;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
q3 = (ht)/R2^2*(1/sqrt((tmp6)*Lo/E));
gama=W*sqrt(ht)/((R2)^(3/2));
q5 = sin((1:5)*pi*thetaf/Theta) .* sin((1:5)*pi*theta0/Theta);
for lll2=1:3
r=1e-10+(lll2-1)*dr;
tmp1 = 7.7*besselj(0,48.3*r)-0.32*bessely(0,48.3*r);
tmp2 = -22*besselj(0,96.7*r)-18.2*bessely(0,96.7*r);
tmp3 = -3*besselj(0,145*r)-17.5*bessely(0,145*r);
tmp4 = -0.8*besselj(0,193*r)-1.6*bessely(0,193*r);
tmp5 = -1.8*besselj(0,242*r)-0.37*bessely(0,242*r);
pw=1e-10;
for nn= 1:21
theta0=1e-10+(nn-1)*dtheta0;
for ll=1:21
z0=1e-10+(ll-1)*dz0;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
HHFv=1e-10;
q12 = (1:5) .^ 4;
q13 = ((1:5) * gama) .^ 4;
for m=1:5
q1 = sin(m*pi*zf/W);
q2 = q1 * sin(m*pi*z0/W);
q11 = 1 / q6^2;
for n=1:5
q6 = 1+(m*3.142*R2/(n*W))^2;
omega_mn1(mmm2,nnn2) = sqrt(tmp6 * q11 * ...
(m*3.142/q13(n)) + q6^2*q12(n)/12);
omega_mn(mmm2,nnn2) = omega_mn1(mmm2,nnn2)*q3;
q4 = omega_mn(m,n)^2;
HHFv=HHFv + q2 * q5(n) / (q4-omega2+1i*Eta1*q4);
end
end
HFv(mmm2,nnn2,num2)=coef_HFv*HHFv;
HHvp=1e-10;
HHvp1=1e-10;
HHvp2=1e-10;
HHvp3=1e-10;
HHvp4=1e-10;
HHvp5=1e-10;
q7 = ((1:5) * pi / W) .^ 2;
for m1=1:5
q1 = abs(m1*pi/(R2-R1));
for n1=1:5
q2 = c * q1 + (2*n1/(R2+R1))^2;
for l1=1:5
omega_n(m1,n1,l1) = q2 + q7(l1);
end
end
end
omega_n2 = omega_n .^ 2;
omega_x = omega_n2 - omega2+1i*Eta2*omega_n2;
q7 = cos((1:5)*pi*z0/W) .* cos((1:5)*pi*zf/W);
q8 = (2/(R2+R1))*theta0;
q9 = (2/(R2+R1))*thetaf;
for n1=1:5
q1 = cos(n1*q8)* cos(n1*q9);
for l1=1:5
q10 = q1 * q7(l1) / omega_x(m1,n1,l1);
HHvp1 = HHvp1 + tmp1 * q10;
HHvp2 = HHvp2 + tmp2 * q10;
HHvp3 = HHvp3 + tmp3 * q10;
HHvp4 = HHvp4 + tmp4 * q10;
HHvp5 = HHvp5 + tmp5 * q10;
end
end
HHvp=HHvp1+HHvp2+HHvp3+HHvp4+HHvp5;
Hvp(mmm2,nnn2,lll2,num2)= HHvp;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
n1_ = 5;
l1_ = 5;
Hvp_1_0_0=tmp5/((c^2*(pi/(R2-R1))^2*(1+1i*Eta2))-omega2);
Hvp_0_1_0=(cos(n1_*q8)*cos(n1_*q9))/((c^2*(2/(R2+R1))^2*(1+1i*Eta2))-omega2);
Hvp_0_0_1=(cos(l1_*pi*z0/W)*cos(l1_*pi*zf/W))/((c^2*(pi/W)^2*(1+1i*Eta2))-omega2);
Hvp_1_1_0=((tmp1)*(cos(n1_*q8)*cos(n1_*q9)))/((c^2*((pi/(R2-R1))^2+(2/(R2+R1))^2)*(1+1i*Eta2))-omega2);
Hvp_0_1_1=(cos(n1_*q8)*cos(n1_*q9)*((cos(l1_*pi*z0/W)*cos(l1_*pi*zf/W))))/((c^2*((2/(R2+R1))^2+(pi/W)^2)*(1+1i*Eta2))-omega2);
Hvp_1_0_1=((tmp1)*(cos(l1_*pi*z0/W)*cos(l1_*pi*zf/W)))/((c^2*((pi/(R2-R1))^2+(pi/W)^2)*(1+1i*Eta2))-omega2);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
CHvp(mmm2,nnn2,lll2,num2)=coef_Hvp*HHvp+coef_1_0_0*Hvp_1_0_0+coef_0_1_0*Hvp_0_1_0+coef_0_0_1*Hvp_0_0_1+coef_1_1_0*Hvp_1_1_0+coef_0_1_1*Hvp_0_1_1+coef_1_0_1*Hvp_1_0_1;
IntFunction=HFv(mmm2,nnn2,num2)*CHvp(mmm2,nnn2,lll2,num2);
pw=pw+dtheta0*dz0*IntFunction;
end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
p_omega(mmm2,nnn2,lll2,num2)=pw;
mean_p2(mmm2,nnn2,lll2,num2)=abs(p_omega(mmm2,nnn2,lll2,num2)*conj(p_omega(mmm2,nnn2,lll2,num2))/2);
end
end
end
waitbar(num2/250,h)
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
m_p2_CLT1_1_1=zeros(51,1);
m_p2_CLT1_2_1=zeros(51,1);
m_p2_CLT1_3_1=zeros(51,1);
m_p2_CLT2_1_1=zeros(51,1);
m_p2_CLT2_2_1=zeros(51,1);
m_p2_CLT2_3_1=zeros(51,1);
m_p2_CLT3_1_1=zeros(51,1);
m_p2_CLT3_2_1=zeros(51,1);
m_p2_CLT3_3_1=zeros(51,1);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
for m2=200:250
m_p2_CLT1_1_1(m2,1)=10*log10(mean_p2(1,1,1,m2));
m_p2_CLT1_2_1(m2,1)=10*log10(mean_p2(1,2,1,m2));
m_p2_CLT1_3_1(m2,1)=10*log10(mean_p2(1,3,1,m2));
m_p2_CLT2_1_1(m2,1)=10*log10(mean_p2(2,1,1,m2));
m_p2_CLT2_2_1(m2,1)=10*log10(mean_p2(2,2,1,m2));
m_p2_CLT2_3_1(m2,1)=10*log10(mean_p2(2,3,1,m2));
m_p2_CLT3_1_1(m2,1)=10*log10(mean_p2(3,1,1,m2));
m_p2_CLT3_2_1(m2,1)=10*log10(mean_p2(3,2,1,m2));
m_p2_CLT3_3_1(m2,1)=10*log10(mean_p2(3,3,1,m2));
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
figure(1)
plot(Freq(:,1), m_p2_CLT1_1_1(:,1), Freq(:,1), m_p2_CLT1_2_1(:,1), Freq(:,1), m_p2_CLT1_3_1(:,1), 'LineWidth',2);
grid on
figure(2)
plot(Freq(:,1), m_p2_CLT2_1_1(:,1), Freq(:,1), m_p2_CLT2_2_1(:,1), Freq(:,1), m_p2_CLT2_3_1(:,1), 'LineWidth',2);
grid on
figure(3)
plot(Freq(:,1), m_p2_CLT3_1_1(:,1), Freq(:,1), m_p2_CLT3_2_1(:,1), Freq(:,1), m_p2_CLT3_3_1(:,1), 'LineWidth',2);
grid on

请先登录,再进行评论。

更多回答(1 个)

Jan
Jan 2013-1-5
编辑:Jan 2013-1-5
Look at this piece of code:
for n1=1:5
for l1=1:5
HHvp4= HHvp4 + d*n1*l1
end
end
This can be abbreviated to:
HHvp4 = HHvp4 + 225 * d;
A pre-allocation is strongly recommended, when you fill an array like omega_n iteratively. But your example runs in 0.000032 seconds on my computer, such that I do not think that this is a bottleneck - except if you call this millions of times, and if so, please explain this. Anyhow, I would create this array like this:
omega_n = zeros(5, 5, 5);
omega_n(:, :, 1) = (1:5)' * (1:5);
for ii = 2:5
omega_n(:, :, ii) = ii * omega_n(:, :, 1);
end
Puh, looks more like Matlab, but it takes 0.000020 also. I do not assume that 0.01 thousandth seconds have been worth to improve this part. But who cares about minutes of programming time, when the execution time is 0.1 seconds shorter?! This is the fastest method I've found under R2009a/64 on a Windows7/Core2Duo:
omega_n = zeros(5, 5, 5);
for l1=1:5
for n1=1:5
for m1 = 1:5
omega_n(m1, n1, l1) = m1 * n1 * l1;
end
end
end
0.0000089 seconds. The order of loops has been reversed, such that neighboring elements are written. But as long as 125*8 bytes match into the processor cache, this is not essential. Using a temporary variable c = n1 * l1 in the inner loop is more efficient in theory, but the improvement is not significant.
  4 个评论
Azzi Abdelmalek
Azzi Abdelmalek 2013-1-5
Zamri, there are some missing data, which allows to test your code (like F0). Also there are parts of your code which are not well formated.
Jan
Jan 2013-1-6
@Zamri: I've formatted your code again. Please follow the linked instructions. Thanks.

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Mathematics 的更多信息

标签

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by