find(condition,1) is slower than using a loop--any way to speed up?

Marshall 2018-10-5

Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/422435-find-condition-1-is-slower-than-using-a-loop-any-way-to-speed-up

Edited: Jan 2018-10-9

In general, using a logical condition in conjunction with find is slower than using a tight loop:

loc = find(X>a, 1)

is slower than using a for loop:

 for i = 1:N
     if(X(i)>a)
         loc = i;
         break;
     end
 end

The reason is that the comparison X>a is evaluated for every element of the vector X before find starts searching, which is wasteful when the first match occurs early. Surely the JIT is smart enough to optimize

find(condition,1)

such that condition is evaluated and tested only until the first match is found. Do any shortcuts for this optimization exist?
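To make the cost difference concrete, here is a minimal standalone C sketch that contrasts the two evaluation strategies and counts comparisons. It is an illustration only, not MATLAB's actual implementation (which this thread does not describe). The names `first_gt_mask` and `first_gt_early` are hypothetical; both return a 1-based index, or 0 when no element exceeds `a`.

```c
#include <stddef.h>
#include <stdlib.h>

/* Strategy 1: build the whole logical mask first, then search it.
   Conceptually what find(X>a, 1) does: always n comparisons.
   Simplification: returns 0 if the mask cannot be allocated. */
size_t first_gt_mask(const double *x, size_t n, double a, size_t *ncmp)
{
    unsigned char *mask = malloc(n ? n : 1);
    size_t i, loc = 0;
    *ncmp = 0;
    if (mask == NULL)
        return 0;
    for (i = 0; i < n; i++)              /* compare every element */
        mask[i] = (unsigned char)(x[i] > a);
    *ncmp = n;
    for (i = 0; i < n; i++) {            /* then search the mask */
        if (mask[i]) { loc = i + 1; break; }
    }
    free(mask);
    return loc;
}

/* Strategy 2: compare and test in one pass, stopping at the first hit. */
size_t first_gt_early(const double *x, size_t n, double a, size_t *ncmp)
{
    size_t i;
    *ncmp = 0;
    for (i = 0; i < n; i++) {
        (*ncmp)++;
        if (x[i] > a)
            return i + 1;                /* 1-based, like MATLAB */
    }
    return 0;                            /* no element exceeds a */
}
```

For a first match at position k in a vector of n elements, the early-exit version performs k comparisons while the mask version always performs n. In MATLAB itself, the per-iteration overhead of an interpreted loop can still outweigh the saved comparisons when the match occurs late, as the timings later in this thread show.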

Bruno Luong 2018-10-6

Edited: Bruno Luong 2018-10-6

IMO, the number of patterns of logical expressions used with find(...'first') is much larger than the few (9) combinations in the matrix product that the parser has to handle for JIT acceleration. There is, in fact, an infinity of "patterns" of logical expressions.

Of course they can pick a few typical patterns and work on those, but this is an unsatisfying solution for SW development.

Another important difference from the matrix product JIT optimization is that there the JIT just has to call a different ready-made BLAS routine for each case the parser detects. For FIND, ANY, ALL, what we are asking for is the possibility of breaking while the output array is being formed, which is a deep-down intervention and requires an entirely different way of evaluating (logical) array operations than MATLAB's base design.

The truth is that MATLAB by design has all those limitations, which make it much slower than some real programming languages. It is used for prototyping, not for viable industrial SW solutions. Just take it as it is.

Marshall 2018-10-8

Edited: Marshall 2018-10-8

My request was to optimize find so that it doesn't perform a logical comparison on extra rows/elements, not to optimize how to compute some 2D array. Your example is entirely irrelevant. Optimization is not all-or-nothing. Compilers almost always have flags that enable or disable varying levels of optimization.

My initial request was to optimize find such that, when paired with a logical operator, it did not produce a full-sized logical matrix and then begin the search. It said nothing about optimizing the code that produces the operands of the logical operator.

What I am requesting is that this be applied to the following statement:

find(X > y,1,'first')

Without any regard to how X is computed. Replace > with whatever logical operator you want. The point is that the comparison "X > y" is only performed as many times as needed.

You are claiming "but that's not fully optimized because you could move the logical operator inside the code that produces X and optimize even more." That is true, but it is an entirely different level of optimization which is probably not even possible.
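As a rough illustration of the request ("replace > with whatever logical operator you want"), the following C sketch performs the comparison only as many times as needed for any of six relational operators. The `cmp_op` enum and `find_first_cmp` are hypothetical names invented for this example; MATLAB exposes no such interface.

```c
#include <stddef.h>

/* Hypothetical operator selector for illustration only. */
typedef enum { OP_GT, OP_GE, OP_LT, OP_LE, OP_EQ, OP_NE } cmp_op;

/* Evaluate (v op y) for a single element. */
static int compare(double v, cmp_op op, double y)
{
    switch (op) {
    case OP_GT: return v >  y;
    case OP_GE: return v >= y;
    case OP_LT: return v <  y;
    case OP_LE: return v <= y;
    case OP_EQ: return v == y;
    case OP_NE: return v != y;
    }
    return 0;
}

/* Returns the 1-based index of the first x[i] with (x[i] op y) true,
   or 0 if there is none. No logical array is ever materialized. */
size_t find_first_cmp(const double *x, size_t n, cmp_op op, double y)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (compare(x[i], op, y))
            return i + 1;
    }
    return 0;
}
```

A call such as `find_first_cmp(x, n, OP_GT, y)` would correspond to `find(X > y, 1, 'first')`, except that it returns 0 rather than an empty array when nothing matches. This covers only the fixed set of operators listed, not arbitrary logical expressions.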

Bruno Luong 2018-10-8

Edited: Bruno Luong 2018-10-8

Funny: you said my 2D example can be optimized, I showed you that the optimization is very partial, and now you claim the example is irrelevant... OK, if you don't want to discuss it anymore.

I already said that your original problem can be fixed by a specific MEX file (and that's not hard to do really).

But if you want to ask TMW to improve FIND(...'FIRST') to fix the drawback, this is very challenging for them if they consider it in the general framework, because it goes against MATLAB's basic architecture. Behind the FIND command is a list of mxArray* input arguments; nothing acts as a compiler that generates an explicit assembler for-loop behind X>y. They (X and y) are manipulated as array objects, period.

There is very little chance that they would stop at a level just to solve your problem. They decided to stop at level 0, meaning FIND(...'FIRST') works on an input array that is fully built. If that's not enough for you, then too bad: you just have to program a for-loop to take advantage of being able to break earlier.

That's my guess, you might or might not agree, up to you.

Please just go ahead and submit the enhancement request.
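Between "fully built input array" and "element-by-element loop" there is a middle ground that nobody in this thread proposes, shown here only as a sketch: evaluate the condition block by block and stop after the first block that contains a match. The function name `first_gt_blocked` and the block size of 4096 are assumptions for illustration and are not tuned.

```c
#include <stddef.h>

#define BLOCK 4096  /* hypothetical block size, not tuned */

/* Returns the 1-based index of the first x[i] > a, or 0 if none.
   At most one block beyond the match is ever compared. */
size_t first_gt_blocked(const double *x, size_t n, double a)
{
    unsigned char mask[BLOCK];
    size_t start, i, len;

    for (start = 0; start < n; start += len) {
        len = (n - start < BLOCK) ? (n - start) : BLOCK;

        /* Array-style step: compare the whole block at once. */
        for (i = 0; i < len; i++)
            mask[i] = (unsigned char)(x[start + i] > a);

        /* Search step: scan this block's mask for the first true value. */
        for (i = 0; i < len; i++) {
            if (mask[i])
                return start + i + 1;
        }
    }
    return 0;
}
```

The same idea can be written in MATLAB as a loop over index ranges that calls find on each slice, which keeps the vectorized comparison while bounding the wasted work to one block.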

Bruno Luong 2018-10-9

Edited: Bruno Luong 2018-10-9

Well, I would expect a low-level implementation to cut the time down by a factor of 4-5, as shown by this case, where FIND(...'FIRST'), the MATLAB for-loop and the MEX implementation must all do the same number of iterations. But yes, the for-loop is reasonably fast already (especially if the search stops early); it just depends on what one expects.

Again, to me MATLAB is not software one should expect to use for speed.

X=zeros(1,1e7);
X(end)=10;
a=1;
tic
for i = 1:length(X)
    if(X(i)>a)
        loc = i;
        break;
    end
end
toc % 0.040234 seconds.
tic
loc = find(X>a, 1, 'first');
toc % 0.009267 seconds.
tic
loc = findfirst_gt(X, a);
toc % 0.008053 seconds.

where findfirst_gt is the MEX file:

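/////////////////////////////////////////////////////////////////////////
// mex function findfirst_gt.c
// mxGetDoubles belongs to the interleaved complex API, so build with:
//   mex -R2018a findfirst_gt.c
//////////////////////////////////////////////////////////////////////////
#include "mex.h"
#include "matrix.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    double *x, a;
    mwSize i, n, loc;   /* mwSize avoids int overflow on large arrays */
    /* Expect a real double array X and a scalar threshold a */
    if (nrhs != 2 || !mxIsDouble(prhs[0]) || mxIsComplex(prhs[0]))
        mexErrMsgIdAndTxt("findfirst_gt:input",
                          "Usage: loc = findfirst_gt(X, a) with real double X.");
    a = mxGetScalar(prhs[1]);
    x = mxGetDoubles(prhs[0]);
    n = mxGetNumberOfElements(prhs[0]);
    loc = 0;            /* 0 is returned when no element exceeds a */
    for (i = 0; i < n; i++)
    {
        if (x[i] > a)
        {
            loc = i + 1;    /* convert to MATLAB's 1-based index */
            break;
        }
    }
    plhs[0] = mxCreateDoubleScalar((double)loc);
}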

Bruno Luong 2018-10-9

Edited: Bruno Luong 2018-10-9

What release do you run? Mine is R2018b, and I ran the script 10 times to be sure.

>> test
Elapsed time is 0.043561 seconds.
Elapsed time is 0.009292 seconds.
Elapsed time is 0.009270 seconds.
>> test
Elapsed time is 0.045115 seconds.
Elapsed time is 0.009850 seconds.
Elapsed time is 0.009204 seconds.
>> test
Elapsed time is 0.045022 seconds.
Elapsed time is 0.009694 seconds.
Elapsed time is 0.008992 seconds.
>> test
Elapsed time is 0.043536 seconds.
Elapsed time is 0.009819 seconds.
Elapsed time is 0.009164 seconds.
>> test
Elapsed time is 0.041917 seconds.
Elapsed time is 0.010048 seconds.
Elapsed time is 0.008875 seconds.
>> test
Elapsed time is 0.042409 seconds.
Elapsed time is 0.009399 seconds.
Elapsed time is 0.008856 seconds.
>> test
Elapsed time is 0.042002 seconds.
Elapsed time is 0.009677 seconds.
Elapsed time is 0.008934 seconds.
>> test
Elapsed time is 0.041580 seconds.
Elapsed time is 0.010062 seconds.
Elapsed time is 0.009134 seconds.
>> test
Elapsed time is 0.041930 seconds.
Elapsed time is 0.009674 seconds.
Elapsed time is 0.008826 seconds.
>> test
Elapsed time is 0.042011 seconds.
Elapsed time is 0.009334 seconds.
Elapsed time is 0.009076 seconds.
>>

Bruno Luong 2018-10-9

Edited: Bruno Luong 2018-10-9

Function calls: these imply some small mxArray header copying, etc. We are talking about sub-microsecond overhead per call here.

Jan 2018-10-9

Edited: Jan 2018-10-9

@Bruno: In the worst case, all elements of X are compared both in the MEX function and in find(X>a, 1). That the latter is faster might be caused by MMX/SSE code, which checks 8 or 16 logicals at once. SSE could be used for the comparison also, but the code would be much larger and would have to handle the exceptional cases in which the mxArray does not start or end at a multiple of the cache-line size. If we take into account the runtime and the programming+debug time of the code, your simple C-MEX function is likely to be optimal. Please post it as an answer, because it solves the problem.
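The idea of checking 8 logicals at once can be approximated in portable C without SSE intrinsics by loading 8 mask bytes into one 64-bit word. This is only a sketch of the principle, not MATLAB's implementation and not real SSE code; `first_true` is a hypothetical name. Using memcpy for the load sidesteps the alignment concerns at the start of the array, and the byte-wise tail loop handles an array length that is not a multiple of 8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the 1-based index of the first nonzero byte in mask,
   or 0 if all n bytes are zero. */
size_t first_true(const unsigned char *mask, size_t n)
{
    size_t i = 0;

    /* Main loop: test 8 logicals per iteration with one word compare. */
    while (n - i >= 8) {
        uint64_t w;
        memcpy(&w, mask + i, sizeof w);   /* safe for any alignment */
        if (w != 0)
            break;                        /* a true value is in these 8 bytes */
        i += 8;
    }

    /* Locate the hit inside the current word, or scan the short tail. */
    for (; i < n; i++) {
        if (mask[i])
            return i + 1;
    }
    return 0;
}
```

Whether this beats a plain byte loop depends on the compiler and hardware, since optimizing compilers may vectorize the simple loop on their own.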
