How to remove the for loop?

w=[1 -0.75 -0.09375 -0.0390625 -0.021972656 -0.014282227 -0.010116577 -0.007587433 -0.005927682 -0.004775077 -0.003939439 -0.00331271 -0.002829606 -0.002448698 -0.00214261 -0.001892639 -0.001685632 -0.001512111 -0.0013651 -0.001239367 -0.001130923 -0.001036679 -0.000954216 -0.000881613 -0.000817328 -0.000760115 -0.000708954 -0.000663003 -0.000621565 -0.000584057 -0.000549987 -0.000518939];
y=[0.841470985 0.963558185 0.999573603 0.946300088 0.808496404 0.598472144 0.33498815 0.041580662 -0.255541102 -0.529836141 -0.756802495 -0.916165937 -0.993691004 -0.982452613 -0.883454656 -0.705540326 -0.464602179 -0.182162504 0.116549205 0.404849921 0.656986599 0.850436621 0.967919672 0.998941342 0.940730557 0.798487113 0.584917193 0.319098362 0.024775425 -0.271760626 -0.544021111 -0.76768581]';
t=0:0.1:pi;
dy=zeros(32,1); %Initialization of dy
for i=2:length(t)
dy(i)= mtimes(w(1:i),y(i:-1:1)); % semicolon suppresses printing dy on every iteration
end
%Expected value of dy is as follows
% dy=[0 0.332454946250000 0.198017059406250 0.0734163455546875 -0.0510670313701985 -0.168851648273860 -0.270865192726796 -0.348550483481562 -0.395200810620278 -0.406748398590689 -0.382201097182780 -0.323762604399792 -0.236650286304258 -0.128636329190141 -0.00935712148064122 0.110545636146184 0.220374543577469 0.310331564135441 0.372393027597958 0.401026261573602 0.393683796647517 0.351030956373549 0.276886467490983 0.177881407093338 0.0628669228432652 -0.0578763540568076 -0.173556593560093 -0.273834716498144 -0.349747909930254 -0.394510207344599 -0.404118623202944 -0.377710685470080]

6 Comments

Why do you want to remove the loop? Just include preallocation of the output array and it will be neat and readable code. Trying to remove this loop will not make the code clearer!
Thanks. But I want to vectorize the code for running it on GPU.
Stephen23 2016-2-1
Edited: Stephen23 2016-2-1
Sure, but you did not answer my question: why? Is the code slow? Are you preallocating the array before the loop? Not preallocating is the main cause of beginners complaining about slow code, just like in your example.
How big are the data arrays?
What should the value of dy(1) be? You are assigning directly from dy(2) onward. May I send an answer that does the same?
Stephen, the code is part of a GL-Derivative. After profiling the entire program, this part takes the most time. So, to reduce the execution time, I am trying to implement it on the GPU using gpuArray(). The size of the data arrays depends on the time period t, which can be pi, 2*pi, 3*pi, ..., n*pi. I have rephrased the question with the values of w and y for t=0:0.1:pi.
Syed, I am assigning directly from dy(2) because it is required that initially dy(1)=0.
Note that
dy(i)= mtimes(w(1:i),y(i:-1:1))
is the same as
dy(i) = dot(w(1:i), y(i:-1:1))
for which it is not necessary that y be a column vector
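A small check of the note above, using only the first three values of w and y from the question (shortened here purely for illustration). It is a sketch, not part of the original thread: mtimes needs a row times a column, while dot accepts vectors of either orientation.

```matlab
% First three samples of w and y from the question (shortened for illustration).
w = [1 -0.75 -0.09375];
y_col = [0.841470985; 0.963558185; 0.999573603];
y_row = y_col.';          % same data as a row vector
i = 3;

% mtimes: row (1-by-i) times column (i-by-1) gives a scalar.
d_mtimes = mtimes(w(1:i), y_col(i:-1:1));

% dot: the same scalar, whether y is a column or a row.
d_dot_col = dot(w(1:i), y_col(i:-1:1));
d_dot_row = dot(w(1:i), y_row(i:-1:1));
```

All three values should agree to rounding error.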


Accepted Answer

dy = sum(triu(w(:)*ones(1,numel(w))).*toeplitz([y(1);zeros(numel(y)-1,1)],y));
dy(1) = 0;

4 Comments

Cute. But I suspect that might not be efficient ;-)
Good answer, but seems more complicated than it needs to be. All you need is
dy = [w * triu(toeplitz(y))]';
dy(1) = 0;
Hi Joss! Your comment has a better answer. Thank you!
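To see that the shorter Toeplitz form reproduces the loop, here is a minimal check on the first four samples of w and y from the question (the data is truncated only to keep the example short; the names dy_loop and dy_vec are introduced here for illustration).

```matlab
% First four samples of w and y from the question (shortened for illustration).
w = [1 -0.75 -0.09375 -0.0390625];
y = [0.841470985 0.963558185 0.999573603 0.946300088]';
n = numel(y);

% Reference: the original loop, with dy(1) left at 0.
dy_loop = zeros(n, 1);
for i = 2:n
    dy_loop(i) = w(1:i) * y(i:-1:1);
end

% Vectorized form from the comment. Column j of triu(toeplitz(y))
% holds y(j:-1:1) followed by zeros, so w times that column is dy(j).
dy_vec = (w * triu(toeplitz(y))).';
dy_vec(1) = 0;

% Largest absolute difference between the two results.
max_abs_diff = max(abs(dy_loop - dy_vec));
```

Note that toeplitz(y) builds a full n-by-n matrix, so memory use grows quadratically with the length of y; for long time spans that may matter more than the loop overhead it removes.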


More Answers (1)

gpu_w = gpuArray(w(:));
gpu_yR = gpuArray( flipud(y(:)));
idx = gpuArray(2:length(t)).';
dy_2_onwards = arrayfun(@(K) dot(gpu_w(1:K), gpu_yR(end-K+1:end)), idx);
dy = [0; gather(dy_2_onwards)];
I also suggest comparing the timing of
gpu_w = gpuArray(w(:));
gpu_y = gpuArray(y(:));
idx = gpuArray(2:length(t)).';
dy_2_onwards = arrayfun(@(K) dot(gpu_w(1:K), gpu_y(K:-1:1)), idx);
dy = [0; gather(dy_2_onwards)];
In both of these, if you already know for sure that w and y are the same orientation then you could skip the (:) . dot on gpuArray might also be okay with vectors of different orientation; I do not have access to the documentation for it.
I would further compare to not using the GPU and instead using
dy = [0; arrayfun(@(K) dot(w(1:K), y(K:-1:1)), (2:length(t)).')]
as I suspect the overhead of using the GPU might not be worth the effort.
It also would not surprise me if a plain loop were faster than the arrayfun.

3 Comments

This won't work on the GPU because gpuArray/arrayfun only supports scalar operations. You can't index any more than one element, you can't call transpose, and you can't call dot.
Ah. But dot is listed as allowed in http://www.mathworks.com/help/distcomp/run-built-in-functions-on-a-gpu.html so it is not obvious why you would not be able to call dot from gpuArray/arrayfun ?
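One way to sidestep the arrayfun restriction, not proposed in this thread, is to note that the loop computes the first n terms of the convolution of w and y, so conv can replace it. The sketch below runs on the CPU with the first four samples from the question. The GPU variant in the trailing comment is an assumption: it presumes Parallel Computing Toolbox, a supported GPU, and gpuArray support in conv, and it was not tested here.

```matlab
% First four samples of w and y from the question (shortened for illustration).
w = [1 -0.75 -0.09375 -0.0390625];
y = [0.841470985 0.963558185 0.999573603 0.946300088]';
n = numel(y);

% Full convolution has length 2n-1; element j (for j <= n) equals
% sum over i = 1..j of w(i)*y(j-i+1), which is the loop's dy(j).
c = conv(w(:), y(:));
dy_conv = c(1:n);
dy_conv(1) = 0;           % the question requires dy(1) = 0

% Reference loop for comparison.
dy_loop = zeros(n, 1);
for i = 2:n
    dy_loop(i) = w(1:i) * y(i:-1:1);
end
max_abs_diff = max(abs(dy_loop - dy_conv));

% Assumed GPU variant (untested, requires Parallel Computing Toolbox):
%   c_gpu = conv(gpuArray(w(:)), gpuArray(y(:)));
%   dy_gpu = gather(c_gpu(1:n)); dy_gpu(1) = 0;
```

Unlike the Toeplitz form, this never builds an n-by-n matrix, so it may scale better as t grows; whether the GPU transfer overhead pays off would still need to be timed.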
