find(condition,1) is slower than using a loop--any way to speed up?
1 次查看(过去 30 天)
显示 更早的评论
Marshall
2018-10-5
In general, using a logical condition in conjunction with find is slower than using a tight loop:
loc = find(X>a, 1)
is slower than using a for loop:
for i = 1:N
if(X(i)>a)
loc = i;
break;
end
end
The reason for this is because the logical operator X>a operates on all elements of the vector X, which is wasteful. Surely the JIT is smart enough to optimize
find(condition,1)
such that condition is applied and tested until the value is found. Do any shortcuts for this optimization exist?
33 个评论
dpb
2018-10-5
I've not tested but I'd tend to suspect it might make a difference with length of the vector.
Of course, the relative time if it is simply naive implementation is dependent on the distribution of the element a in the vector.
Never hurts to submit enhancement requests; performance is a worthy subject as well as features.
Marshall
2018-10-5
It is absolutely dependent. The typical assumption should be that the position of a is uniformly distributed. When this is the case, the gain from using the loop increases as the length of the vector increases--it appears that recent versions of Matlab, in which tight loops have been optimized by the JIT, do pretty well in this case, and thus as X increases in length, the number of extra operations performed by the logical method increases.
Walter Roberson
2018-10-5
Marshall
2018-10-5
This isn't the same issue; Matlab addressed this issue a few years back with the find(X,1,'first') and find(X,1,'last').
Bruno Luong
2018-10-5
编辑:Bruno Luong
2018-10-6
I think dpb that the enhancement request would be a very challenging one for folks at TMW. ANY(), ALL() suffer the same issue, that doesn't have an easy short-cut break. MATLAB just construct the entire array parameter before calling the function, hard to by pass it.
@Marshall, the only way I see is write your MEX files, for case-to-case basis.
Walter Roberson
2018-10-6
find(condition,1,'first') still has the semantics of evaluating the condition fully before doing the finding.
However, it might require overloading a relation operator to prove it, as one could hypothesize that the case of find(constant relationship constant, 1, 'first') could have been optimized.
We can see from the find() documentation,
"When you execute find with a relational operation like X>1, it is important to remember that the result of the relational operation is a logical matrix of ones and zeros. For example, the command [row,col,v] = find(X>1) returns a column vector of logical 1 (true) values for v."
This is not explicit about the 1, 'first' case, but does tend to imply that we should expect the relationship to be fully evaluated first, same as everything else that is documented (except for && and || which are documented otherwise.)
Matt J
2018-10-6
编辑:Matt J
2018-10-6
But the contention is that the JIT should have special optimized exceptions for find() calls with certain patterns, much as it does for certain matrix operation sequences.
For example, A*B.' should in theory execute mtimes(A,transpose(B)), but we know that doesn't happen when A and B are built-in matrix types.
dpb
2018-10-6
编辑:dpb
2018-10-6
Whether implementation is/isn't hard isn't a criterion against which to judge whether or not to submit an enhancement request imo; that onus falls on TMW as to "how" altho I'm certain some at least preliminary thinking about the problem must enter into the selection process of "what" gets chosen from the zillions on the list.
Walter Roberson
2018-10-6
编辑:Walter Roberson
2018-10-6
I don't disagree that it would be good idea; I am disagreeing with Marshall's implication that Mathworks has already done this. find(condition,1,'first') has been documented since R14 (but not R13SP2) so it is not some kind of new change for efficiency.
Bruno Luong
2018-10-6
编辑:Bruno Luong
2018-10-6
IMO, the number of patterns of logical expressions using by find(...'first') is much larger than the few (9) combinations in the matrix product for the parser to handle for JIT acceleration. There is an infinity "patterns" of logical expression in fact.
Of course they can peak few typical patterns and work on that, but this is unsatisfied solution for SW development.
Also another important difference with matrix product JIT opt is that for matrix JIT opt just have to call different ready BLAS routine for each case the parser detects. For FIND, ANY, ALL, what we ask is possibility of breaking during the output array is being formed, which is a deep down intervention, and it requires an entirely different way of evaluation of array (logical) operations than MATLAB base design.
The true is MATLAB by design has all those limitations, and makes it much slower than some real computer programming languages. It is use for prototyping, not for viable industrial SW solution. Just take it as it is.
dpb
2018-10-6
That's an issue, certainly, in implementation but don't see any reason to throw out consideration of possibilities that could be incorporated even for subset of the entire problem space.
Bruno Luong
2018-10-6
编辑:Bruno Luong
2018-10-6
Surely they can. But imagine your software relies on some solid foundation (here is mxArray as objects), then you put some very special codes that go out and against of this solid foundation in order to grab few ms, on 0.01% of used-case. Imagine the cost of maintenance such thing year over year.
Just imagine a perfectly reasonable example of code like this operated on a complex 2D array z.
i = find(all(z.*conj(z),2) > 1, 1 'first')
I cannot imagine the effort they would put to be able to stop such operation in the middle.
No they won't do it IMO, but you can try to submit the enhancement of course.
dpb
2018-10-6
编辑:dpb
2018-10-6
I wasn't the one asking, Bruno... :)
Probably wouldn't ever get implemented, agreed...that doesn't mean necessarily one might not consider the question and possibilities at some length, however, before making a final decision...all that I suggested.
IF the OP wishes to submit the enhancement is up to him; clearly his best bet for getting any joy would be to do some pretty heavy lifting up front similar to submitting proposal to one of the working groups for a standardized language. There at least, one can find a pathway for communication with the WG mailing list and develop a feel for whether there is any chance of anything or not; whether one could engage TMW in such preliminaries is unknown to me; I'm guessing not much more likely than so...
As Walter pointed out, the way to get short-term relief is with a current mex file if that one does what the OP needs or to write one that does the specific interest case.
Bruno Luong
2018-10-6
编辑:Bruno Luong
2018-10-6
Not sure, Walter MEX FEX-link does nothing more or faster than the current stock FIND(...,1,FIRST');
What OP needs is a specific MEX for his problem (convert the MATLAB for-loop).
dpb
2018-10-6
编辑:dpb
2018-10-6
I think that's what I said... :)
I don't know if the FEX submittal helps for OP or not; it might/might not depending on just what it does, specifically; if it doesn't then the route he could take would be to build one for his particular use case, yes...if it saves enough to make it worth the effort.
If the use is in the bowels of a deeply-nested optimization routine or some such, it might well be so; if it's just observing the behavior in command line script consumes a few extra msec than optimum, then "not so much"...
ADDENDUM On looking, it won't help OP's question; it is specific only to the case of searching for first/last nonzero in the array; the logic test and conversion to double would have to be done prior to calling it so undoubtedly that will lose a fair amount of the benefit altho there might be some gain in then short-circuiting the test.
Marshall
2018-10-8
This optimization should not be too hard to implement for the general case when an element-wise comparison is going on. Bruno gave the following example:
i = find(all(z.*conj(z),2) > 1, 1 'first')
And claimed this would be difficult to optimize. Surely it would not. First, one computes a temporary variable:
Z = z.*conj(z);
One now is confronted with the following:
i = find(all(Z>1,2),1,'first');
This optimization is only difficult if the all function cannot be inlined by the JIT; if so, it probably fits into one of the 5 or so patterns that the optimization would solve, which include any and all, in additional to the relational operators. The extra processing is that all rows of Z are tested. Here's a potential optimization, although it might be slower because all() is called each iteration:
for i = 1:size(Z,1)
// this is probably slower than the full evaluation all(Z>1,2) because all() is called each iteration
if(all(Z(i,:)), break; end
end
Bruno Luong
2018-10-8
编辑:Bruno Luong
2018-10-8
"First, one computes a temporary variable: Z = z.*conj(z);"
Too late! You already make computer calculation more than necessary, because you already compute the entire 2D array!
You decide the optimization is to be smart only at the first level, but no, a correct optimization must go to the deepest level and break from there (in the example it should stop at the CONJ calculation).
Marshall
2018-10-8
编辑:Marshall
2018-10-8
My request was to optimize find so that it doesn't perform a logical comparison on extra rows/elements, not to optimize how to compute some 2D array. Your example is entirely irrelevant. Optimization is not all-or-nothing. Compilers almost always have flags that enable or disable varying levels of optimization.
My initial request was to optimize find such that, when paired with a logical operator, it did not produce a full-sized logical matrix and then begin the search. It said nothing about optimizing the code that produces the operands of the logical operator.
What I am requesting is that this be applied to the following statement:
find(X > y,1,'first')
Without any regard to how X is computed. Replace > with whatever logical operator you want. The point is that the comparison "X > y" is only performed as many times as needed.
You are claiming "but that's not fully optimized because you could move the logical operator inside the code that produces X and optimize even more." That is true, but it is an entirely different level of optimization which is probably not even possible.
Bruno Luong
2018-10-8
编辑:Bruno Luong
2018-10-8
Funny you said my claim of the 2D example can be optimized, I show you that the optimization is very partial, then now now claim the example is irrelevant... OK if you don't to discuss it anymore.
I already said that your original problem can be fixed by a specific MEX file (and that's not hard to do really).
But if you want to ask TMW to improve FIND(...'FIRST') to fix the drawback, this is a very challenging for them if they consider in the general framework, because it goes against MATLAB basic architecture. Behind the FIND command is the list of mxArray* input arguments, there is nothing done as a compiler that make an explicit assembler for-loop behind X>y; they are (X and y) manipulated as array objects, period.
There is very little chance that they would stop at a level just to solve your problem. They decided to stop at level 0, meaning FIND(...'FIRST') works on input array that is fully built. If that's not enough for you, then too bad, you just have to program a for-loop to take advantage of being able of breaking earlier.
That's my guess, you might or might not agree, up to you.
Please just go ahead and submit the enhancement request.
Matt J
2018-10-8
What I am requesting is that this be applied to the following statement: find(X > y,1,'first')
I still feel pretty sure that this optimization already exists in Matlab when you implement the search as a for-loop. You could request that find() expressions be similarly parsed, but that would only have advantages of shorter syntax, not performance.
Bruno Luong
2018-10-9
编辑:Bruno Luong
2018-10-9
Well a low-level implementation I would expect it can cut down by a factor of 4-5, as show this case where FIND(...'FIRST'), MATLAB for-loop and MEX implementation must do the same number of iterations. But yeah for-loop is reasonably fast already (especially if the search stops earlier), just depends on what one want expect.
Again, to me MATLAB is not a standard SW one should expect to use not for speed.
X=zeros(1,1e7);
X(end)=10;
a=1;
tic
for i = 1:length(X)
if(X(i)>a)
loc = i;
break;
end
end
toc % 0.040234 seconds.
tic
loc = find(X>a, 1, 'first');
toc % 0.009267 seconds.
tic
loc = findfirst_gt(X, a);
toc % 0.008053 seconds.
with findfirst_gt is the MEX file
/////////////////////////////////////////////////////////////////////////
// mex function findfirst_gt.c
//////////////////////////////////////////////////////////////////////////
#include "mex.h"
#include "matrix.h"
void mexFunction(int nlhs, mxArray *plhs[],int nrhs, const mxArray *prhs[])
{
double *x, a;
int i, loc, n;
a = mxGetScalar(prhs[1]);
x = mxGetDoubles(prhs[0]);
n = mxGetNumberOfElements(prhs[0]);
loc = 0;
for (i=0; i<n; i++)
{
if (x[i] > a)
{
loc = i+1;
break;
}
}
plhs[0] = mxCreateDoubleScalar(loc);
}
Bruno Luong
2018-10-9
编辑:Bruno Luong
2018-10-9
What release do you run? Mine is R2018B and I run the script 10 times to be sure.
>> test
Elapsed time is 0.043561 seconds.
Elapsed time is 0.009292 seconds.
Elapsed time is 0.009270 seconds.
>> test
Elapsed time is 0.045115 seconds.
Elapsed time is 0.009850 seconds.
Elapsed time is 0.009204 seconds.
>> test
Elapsed time is 0.045022 seconds.
Elapsed time is 0.009694 seconds.
Elapsed time is 0.008992 seconds.
>> test
Elapsed time is 0.043536 seconds.
Elapsed time is 0.009819 seconds.
Elapsed time is 0.009164 seconds.
>> test
Elapsed time is 0.041917 seconds.
Elapsed time is 0.010048 seconds.
Elapsed time is 0.008875 seconds.
>> test
Elapsed time is 0.042409 seconds.
Elapsed time is 0.009399 seconds.
Elapsed time is 0.008856 seconds.
>> test
Elapsed time is 0.042002 seconds.
Elapsed time is 0.009677 seconds.
Elapsed time is 0.008934 seconds.
>> test
Elapsed time is 0.041580 seconds.
Elapsed time is 0.010062 seconds.
Elapsed time is 0.009134 seconds.
>> test
Elapsed time is 0.041930 seconds.
Elapsed time is 0.009674 seconds.
Elapsed time is 0.008826 seconds.
>> test
Elapsed time is 0.042011 seconds.
Elapsed time is 0.009334 seconds.
Elapsed time is 0.009076 seconds.
>>
Matt J
2018-10-9
I am running R2018a, but I was running on the original version of your code, which had,
X(100)=10;
Bruno Luong
2018-10-9
编辑:Bruno Luong
2018-10-9
Put the test in FUNCTION, the for-loop gets JIT-optimization better, just 2.5 time slower than MEX
function test
X=zeros(1,1e7);
X(end)=10;
a=1;
ntest = 100;
t = zeros(3,ntest);
for n=1:ntest
tic
for i = 1:length(X)
if(X(i)>a)
loc = i;
break;
end
end
t(1,n) = toc;
tic
loc = find(X>a, 1, 'first');
t(2,n) = toc;
tic
loc = findfirst_gt(X, a);
t(3,n) = toc;
end
t = mean(t,2)*1e3;
fprintf('MATLAB for-loop: %1.6f ms\n', t(1));
fprintf('MATLAB FIND(...,''first''): %1.6f ms\n', t(2));
fprintf('MEX implementation: %1.6f ms\n', t(3));
The results are:
MATLAB for-loop: 18.959884 ms
MATLAB FIND(...,'first'): 7.229707 ms
MEX implementation: 7.569961 ms
Bruno Luong
2018-10-9
编辑:Bruno Luong
2018-10-9
Here is the result if the TRUE elements is at position 100, Matlab for-loop wins (I guess calling a function and argument verification inside FIND are relatively costly at this level)
MATLAB for-loop: 0.001281 ms
MATLAB FIND(...,'first'): 4.450052 ms
MEX implementation: 0.017391 ms
With a single modification to a test function
X(100)=10;
Bruno Luong
2018-10-9
编辑:Bruno Luong
2018-10-9
function calls: that implies some small mxArray header copying, etc... we talking about sub µs by calling here.
Jan
2018-10-9
编辑:Jan
2018-10-9
@Bruno: In the worst case, all X and compared in the Mex function and in find(X>a, 1). That the latter is faster might be caused by MMX/SSE code, which checks 8 or 16 logicals at once. SSE could be used for the comparison also, but the code will be much larger and has to catch the exceptions that the mxArray does not start or end at a multiple of the cache-line size. If we take into account the runtime and teh programming+debug time of the code, your simple C-Mex function is likely to be optimal. Please post is as an answer, because it solves the problem.
回答(0 个)
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Logical 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!发生错误
由于页面发生更改,无法完成操作。请重新加载页面以查看其更新后的状态。
您也可以从以下列表中选择网站:
如何获得最佳网站性能
选择中国网站(中文或英文)以获得最佳网站性能。其他 MathWorks 国家/地区网站并未针对您所在位置的访问进行优化。
美洲
- América Latina (Español)
- Canada (English)
- United States (English)
欧洲
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)
亚太
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)