Fast calculation of short cumulative product vector

Question

Knut 2011-9-6

1
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/15220-fast-calculation-of-short-cumulative-product-vector

I want to calculate a reasonable short vector many times with variable inputs a and len

%y = [a a^2 a^3...a^(len)]

My methods so far are the following:

y1 = cumprod(a(ones(len,1)))
y2 = a.^(1:len)

y1 seems to be faster than y2 for len>11 or so. Any suggestions for doing this even faster? The knowledge that y(n) = y(n-1)*y(1) should be exploitable somehow, I just dont see how.

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

请先登录，再回答此问题。

Answer 1

James Tursa 2011-9-6

2
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/15220-fast-calculation-of-short-cumulative-product-vector#answer_20670

在 MATLAB Online 中打开

Or this variation, which I don't see in the current posts:

y = cumprod(a.*ones(1,n));

I think the MATLAB JIT may recognize this formulation and not actually do the multiply, but just populate the result ala repmat.

And if this calculation is dominating your overall runtime, you can always turn to a simple mex function to do it and avoid the imtermediate array formulations.

2 个评论
显示无隐藏无

Andrei Bobrov 2011-9-6

now the fastest variant, +1

Oleg Komarov 2011-9-7

+1

请先登录，再进行评论。

Answer 2

Oleg Komarov 2011-9-6

1
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/15220-fast-calculation-of-short-cumulative-product-vector#answer_20655

在 MATLAB Online 中打开

time = zeros(100,3);
for len = 1:100
    a = rand(1);
    o = ones(1,len);
    v = 1:len;
      tic
      for s = 1:1000; y1 = cumprod(a(o)); end
      time(len,1) = toc;
      tic
      for s = 1:1000; y2 = a.^(v); end
      time(len,2) = toc;
      tic
      for s = 1:1000; y3 = bsxfun(@power,a,v); end
      time(len,3) = toc;
  end
plot(time)
legend('cumprod','.^','bsxfun')

EDIT cumprod, power, explog, cumprodx, cumprodpow

James' cumprodpow is the fastest.

num_iter = 3000;
time = zeros(100,5);
for len = 1:100
    a = rand(1);
    tic
    for s = 1:num_iter
        y1 = cumprod(a(ones(1,len)));
    end
    time(len,1) = toc;
    pause(0.001)
    tic
    for s = 1:num_iter
        y2 = a.^(1:len);
    end
    time(len,2) = toc;
    pause(0.001)
    tic
    for s = 1:num_iter
        y3 = exp(log(a)*(1:len));
    end
    time(len,3) = toc;
    pause(0.001)
    tic
    for s = 1:num_iter
        y4 = cumprodx(a,len);
    end
    time(len,4) = toc;
    pause(0.001)
    tic
    for s = 1:num_iter
        y5 = cumprod(a.*ones(1,len));
    end
    time(len,5) = toc;
end
plot(time)
legend('cumprod','power','explog','cumprodx','cumprodpow')

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

Andrei Bobrov 2011-9-6

Hi friends!

Firstly, +1;

secondly, small additions:

time = zeros(100,5);

for len = 1:100

a = rand(1);

o = ones(1,len);

v = 1:len;

tic

for s = 1:1000; y1 = cumprod(a(o)); end

time(len,1) = toc;

tic

for s = 1:1000; y2 = a.^(v); end

time(len,2) = toc;

tic

for s = 1:1000; y3 = bsxfun(@power,a,v); end

time(len,3) = toc;

tic

for s = 1:1000; y4 = arrayfun(@(x)a^x,v); end

time(len,4) = toc;

tic

for s = 1:1000, for v = len:-1:1, y5(v) = a^v; end; end

time(len,5) = toc;

end

plot(time)

legend('cumprod','power','bsxfun','arrayfun','for..end')

请先登录，再进行评论。

Answer 3

Grzegorz Knor 2011-9-6

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/15220-fast-calculation-of-short-cumulative-product-vector#answer_20652

在 MATLAB Online 中打开

My suggestion:

y3 = bsxfun(@power,a,1:len)

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

Answer 4

Derek O'Connor 2011-9-6

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/15220-fast-calculation-of-short-cumulative-product-vector#answer_20714

在 MATLAB Online 中打开

Nobody has followed Knut's suggestion:

y(n) = y(n-1)*y(1)

It is in fact a special case of Horner's Method for evaluating a polynomial

yn(x) = a0 + a1*x + a2*x^2 + ... + an*x^n.

The time behavior of the two methods below is very erratic but the loop or Horner method always seems to beat the CumProd method by 30% to 50% up to about n = 150. After that, CumProd gets better and better. At n = 1000, Horner's time is about 5*CumProd time.

If you have the time try [yc,yh,time] = CumProdvsHorner(1000, 10000);

Has anyone got an explanation for the spikes in the execution times?

 function [yc,yh,yx,time] = CumProdvsHorner(nmax,ns);
 % Calculation of y = [a a^2 ... a^n]
 % USE: [yc,yh,yx,time] = CumProdvsHorner(300,1000);
 nmeths=3;
 time  = zeros(nmax,nmeths);
 ntime = zeros(nmax,nmeths);
 for n = 2:nmax
    x = rand; xsave = x;
    yc = zeros(1,n); 
    yh = zeros(1,n);
    yx = zeros(1,n);
    tc = tic;
    for s = 1:ns
        yc = cumprod(x.*ones(1,n)); 
    end
    time(n,1) = toc(tc);
    tr = tic;
    x = xsave;
    for s = 1:ns
        yh(1) = x;
        for k = 2:n
            yh(k) = x*yh(k-1);
        end
    end
    time(n,2) = toc(tr); 
    tx = tic;
    for s = 1:ns
        yx = cumprodx(x,n); 
    end
    time(n,3) = toc(tx);    
 end % for n
 figure
 plot(20:nmax,time(20:nmax,:))
 legend('CumProd','Horner','CumProdx')
 xlabel('n');
 ylabel('Time (secs)')
 title('Calculate y(x,n) = [x x^2 x^3 ... x^n]')
 for m = 1:nmeths
    ntime(:,m) = time(:,m)./(1:nmax)';
 end
 figure
 plot(20:nmax,ntime(20:nmax,:))
 legend('CumProd','Horner','CumProdx')
 xlabel('n');
 ylabel('Normalized Time T(n)/n')
 title('Calculate y(x,n) = [x x^2 x^3 ... x^n]')
 %
 % END CumProdvsHorner

EDIT: Added plots for normalized times, renamed some variables.

5 个评论
显示 3更早的评论隐藏 3更早的评论

bym 2011-9-7

also spikes are different for Horner & CumProd

Derek O'Connor 2011-9-7

@proecsm

Take a look at the normalized times with the updated function using:

[yc,yh,time] = CumProdvsHorner(1000,1000);

CumProd's behavior is very strange. As Walter says, this may be due to switching in and out of BLAS.

请先登录，再进行评论。

Answer 5

James Tursa 2011-9-7

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/15220-fast-calculation-of-short-cumulative-product-vector#answer_20719

在 MATLAB Online 中打开

Here is the quick and dirty mex solution:

// File cumprodx.c
// c = cumprodx(a,n) does equivalent of c = cumprod(a.*ones(1,n))
// a = real double scalar
// n = real double scalar with integral value > 0
// Caution: No argument checking
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    double a = mxGetScalar(prhs[0]);
    mwSize i, n = mxGetScalar(prhs[1]);
    double *pr;
    plhs[0] = mxCreateDoubleMatrix(1,n,mxREAL);
    pr = mxGetPr(plhs[0]);
    pr[0] = a;
    for( i=1; i<n; i++ ) {
        pr[i] = pr[i-1] * a;
    }
}

3 个评论
显示 1更早的评论隐藏 1更早的评论

James Tursa 2011-9-7

For the smaller values of n the overhead plays a significant factor as a percentage of the overall timing. The timing advantage of Horner's method in the testing isn't because it is inherently a faster method, but because it avoids the overhead of function calls. If you place Horner's method inside a function I think you will see it suddenly become one of the slowest methods. I am not knocking your previous timing tests ... they are valid and the fact that one can code a method without function calls *is* a real advantage. I am just pointing out that the timing comparison is actually being dominated by function call overhead in the small n cases.

Derek O'Connor 2011-9-7

A good point. I had overlooked the function call overhead, which

can be a significant part of the time to do the necessary (n-1)

multiplications.

However, the timings serve as a warning not to forget the simpler,

loopy methods, especially with (an improving?) JIT compiler.

Vectorization has become something of a fetish with some Matlab

programmers, which blinds them to simpler methods.

请先登录，再进行评论。

Answer 6

Knut 2011-9-7

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/15220-fast-calculation-of-short-cumulative-product-vector#answer_20743

在 MATLAB Online 中打开

Thank you for all your help. There seems to be many different ways to accomplish this, each with different speed-tradeoffs. The mex-file might run faster with single-presicion and optimization flags, but I did not bother.

For lengths < ~80 elements, 'Horners Method' seems to be the best choice. I did notice some strange "warm-up" effects: increasing the number of iterations caused the relative performance to shift quite a lot. Is this the JIT playing games on me?

num_iter = 100;
len_span = [1:50 100:100:2000 1e4 3e4 1e5 3e5 1e6];
num_iter_span = 1e6./len_span;
time = zeros(length(len_span),7);
for len_idx = 1:length(len_span)
  num_iter = num_iter_span(len_idx);
  len = len_span(len_idx);
  eps = 1e-7;
  a = rand(1);
    tic
    for s = 1:num_iter;         
        y1 = cumprod(a*ones(1, len)); 
    end
    time(len_idx,1) = toc;
    assert(max(abs(y1-y1))<eps);
    tic
    for s = 1:num_iter; 
        y2 = a.^(1:len); 
    end
    time(len_idx,2) = toc;
    assert(max(abs(y1-y2))<eps);
    tic
    for s = 1:num_iter; 
        y3 = bsxfun(@power,a,1:len); 
    end
    time(len_idx,3) = toc;
    assert(max(abs(y1-y3))<eps);
    tic
    for s = 1:num_iter, 
        for v = len:-1:1
            y4(v) = a^v; 
        end; 
    end
    time(len_idx,4) = toc;
    assert(max(abs(y1-y4))<eps);
    tic
    for s = 1:num_iter, 
        y5(1) = a;
        for v = 2:len 
             y5(v) = a*y5(v-1);
        end; 
    end
    time(len_idx,5) = toc;
    assert(max(abs(y1-y5))<eps);
    tic
    for s = 1:num_iter, 
        y6 = cumprodx(a,len);
    end
    time(len_idx,6) = toc;
    assert(max(abs(y1-y6))<eps);
    tic
    for s = 1:num_iter, 
        y7 = exp(log(a)*(1:len));
    end
    time(len_idx,7) = toc;
    assert(max(abs(y1-y7))<eps);
end
figure
semilogx(len_span, time)
xlabel('vector length')
ylabel('time/vector length')
grid on
axis tight
legend('cumprod(a([1 1,...]))','power','bsxfun','for..end', 'Horners Method', 'mex', 'log')

3 个评论
显示 1更早的评论隐藏 1更早的评论

Knut 2011-9-7

Because this is similar to how I use the code in my application:

len_vect = [9 20 19 42 50];

a_vect = [0.7 0.8 0.9 1 1.1];

for t = 1:length(len_vect)

y = a_vect(t).^(1:len_vect(t));

end

As both a and len change for each iteration, it makes sense to include them within the tic and toc: I dont see how I can hide the cost of creating the vectors outside the loop?

Oleg Komarov 2011-9-7

You didn't mention that in your original post.

Anyways, in this case I find that James' cumprod(a.*ones(1,len)) is the fastest, always.

请先登录，再进行评论。

Fast calculation of short cumulative product vector

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

回答（6 个）

2 个评论
显示无隐藏无

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

5 个评论
显示 3更早的评论隐藏 3更早的评论

3 个评论
显示 1更早的评论隐藏 1更早的评论

3 个评论
显示 1更早的评论隐藏 1更早的评论

另请参阅

类别

标签

Community Treasure Hunt

Fast calculation of short cumulative product vector

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

回答（6 个）

2 个评论 显示 无隐藏 无

1 个评论 显示 -1更早的评论隐藏 -1更早的评论

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

5 个评论 显示 3更早的评论隐藏 3更早的评论

3 个评论 显示 1更早的评论隐藏 1更早的评论

3 个评论 显示 1更早的评论隐藏 1更早的评论

另请参阅

类别

标签

Community Treasure Hunt

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

2 个评论
显示无隐藏无

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

5 个评论
显示 3更早的评论隐藏 3更早的评论

3 个评论
显示 1更早的评论隐藏 1更早的评论

3 个评论
显示 1更早的评论隐藏 1更早的评论