extremely inefficient piece of code ... bug?
3 次查看(过去 30 天)
显示 更早的评论
Michal
2023-8-31
I have two codes that implement exactly the same task:
idx(idx<=0) = 1;
idx(idx>=n) = n-1;
and
if idx <= 0
idx = 1;
elseif idx >= n
idx = n-1;
end
where idx and n are a scalar integer values.
This piece of code is evaluating multiple times (N = 1e6, for example). During profiling of the code, I found very strange inefficiency of the first code, which is about ~ 100-1000x slower than the 2nd one. I use the 1st type of code to get more readable code by avoiding if-else construct, but the overall performance is on R2023a really terrible. See attached test code (test.m).
Is this behavior normal, or is it a BUG?
Please, could you verify this code on other (older) MATLAB releases to see, if the problem is only R2023a related or not?
24 个评论
Bruno Luong
2023-8-31
编辑:Bruno Luong
2023-8-31
Why it's a bug? The first code is not efficient since it calls array comparison + logical array + array indexing + array sub assigment. each function have to create intermediate MATLAB array result, decide which branch code it goesn, and perhaps doing some deep duplication of data (?), garbage collector, etc...
Whereas the second code the JIT just do the work as simple as possible.
It is not a mistery or unusual that the first code is inefficient.
To me the first code doesn't look readable at all, it is very missleading for scalar input.
Michal
2023-8-31
编辑:Michal
2023-8-31
@Bruno Luong I did not say, that it is a BUG. I just asking for any opinions.
idx <= 0 or idx >= n is not an array comparison, due to the fact, that idx and n are scalar values.
If you were right, the code
if idx <= 0
idx(true) = 1;
elseif idx >= n
idx(true) = n-1;
end
would be as fast as the original 1st code. But it's not, it's still significantly faster!
Bruno Luong
2023-8-31
编辑:Bruno Luong
2023-8-31
BTW you repeat the uppercase BUG in two places.
Yes idx <= 0 if expressed inside the idexing would call function le that accepts arrays and returns a logical array, even though you have a scalar 1x1 array. It is NOT inside the if conditional test staement. Function le supposes to work on array so doing all kind of fancy stuffs that you don't see behind MATLAB compact syntax.
John D'Errico
2023-8-31
编辑:John D'Errico
2023-8-31
So only after I wrote my answer explaining why the two are different, do I learn that idx is a scalar. I give up. I cannot read your mind. I've deleted my answer, since it presumed you were using vectors, as the indexing lines only made sense if idx was a vector.
This is not a bug. You are using indexing to do something that is designed to be efficient for vectors, but no vectors were involved.
On a scalar, the simple if test will of course be more efficient. It does not even perform the second test and branch at all if the first branch is executed. AND it has always been that way, although the if test has become more efficient over the years, since they have improved the speed of the parser to write better code internally. Those changes in the parser for efficiency were made mainly almost certainly long before you ever learned MATLAB though.
Bruno Luong
2023-8-31
编辑:Bruno Luong
2023-8-31
When I'm lazy I do this
idx = max(min(idx,n-1),1);
or
idx = min(max(idx,1),n-1);
rather than if-else. Never cross my mind to do array logical indexing for scalar.
Strictly speaking this is NOT equivalent to either since your methods let NaN untouched, not mine (and in this case the two above min/max methods return different results for NaN).
When I need efficient code, I use if/else.
Michal
2023-8-31
@Bruno Luong Nice ... but still about 10x slower than if-else construct.
So, in the summary, I learned a few things:
- probably due to the continious JIT improvements, the low-level (not-vectorised) approaches (like if-else construct) will be more and more efficient
- using array oriented operations on scalar values is not a good idea at all
- this is definitely not a BUG, but frankly speaking, looks like a potencial BUG :)
Rik
2023-8-31
I was going to suggest something like Bruno posted (but was interupted before I hit submit).
The only thing to watch out for is that this code may not work in edge cases (like NaN, but also with non-integers).
idx=0.5;n=0;
Bruno(idx,n),Michal(idx,n)
1
-1
idx=1;n=1;
Bruno(idx,n),Michal(idx,n)
1
0
function Bruno(idx,n)
idx = max(min(idx,n-1),1);
disp(idx)
end
function Michal(idx,n)
if idx <= 0
idx(true) = 1;
elseif idx >= n
idx(true) = n-1;
else
error('undefined') % always include an else branch
end
disp(idx)
end
Rik
2023-8-31
Why do you insist on the word bug? A bug is unexpected/undefined/undocumented behavior. Perfomance isn't a bug unless the difference is catastrophic. If the indexing-style took 5 seconds to complete (instead of miliseconds), that would be a bug. Right now it is just behavior you did not expect.
I think the most valuable lesson here is this: write code the way Mathworks engineers expect you to. That way the improvements in the JIT (or Execution Engine, same difference) will benefit your code as well. Your coding style may or may not be intrinsically better or clearer. If you care about optimal performance on a broad range of releases, use the style Matlab expects.
Bruno Luong
2023-8-31
编辑:Bruno Luong
2023-8-31
To get it straight the time needed for
idx(idx<=0) = 1;
idx(idx>=n) = n-1;
is about 0.5 microsecond.
Michal
2023-8-31
编辑:Michal
2023-8-31
@Rik OK ... you are right regarding bug definition.
Is there available any comprehensive and clear description of what the "MATLAB Expects" programming style is? From my point of view, there is no generally recommended MATLAB programming style at all. There are only a myriad of recommendations spread over documentation and other MATLAB community sources. And these recommendations are sometimes contradictory or at least a bit confusing.
Michal
2023-8-31
@Bruno Luong Yes and time needed for
if idx <= 0
idx = 1;
elseif idx >= n
idx = n-1;
end
is about 0.5 nano seconds, I did not see you point...
Bruno Luong
2023-8-31
编辑:Bruno Luong
2023-8-31
I don't want to make any point beside giving a more precise timing for those who read with Rik statement 'If the indexing-style took 5 seconds to complete (instead of miliseconds)". It's submicrosecond range.
Michal
2023-8-31
The situation looks completely different, when you are using this piece of code many times. My MonteCarlo code use this piece code about 1e8 times for one simulation run, so the overall difference is 1min vs 0.3 sec. This is a huge difference when I need to perform about 1e3 separate MC runs.
Rik
2023-8-31
For the purposes of my comment microsecond and nanoseconds are the same. My point was that both methods are so fast that it doesn't matter. For one to be slow enough to be classified as a bug, they would have to be unexpectedly slow.
There are actually some recommended methods gathered in some places. There are cheat-sheet pdfs (although I can never find them, nor do I need them anymore) provided by Mathworks.
More interestingly: besides the scattered (and sometimes contradictory) advice in the documentation, there is this thread.
I don't know where in the documentation I would mention that indexing of a scalar is not recommend. Where would you do so?
Bruno Luong
2023-8-31
编辑:Bruno Luong
2023-8-31
Don't tell me that you mainly do scalar clipping on your montecarlo simulation.
Rik
2023-8-31
To put into other words what Bruno (probably) meant: this smells like a situation where you should work with arrays and do logical indexing.
Michal
2023-8-31
编辑:Michal
2023-8-31
@Bruno Luong Yes, I do :) I am simulating behaviour of random signals for different types of linear interpolations between sampling time points, and "scalar clipping" plays here relatively crucial role at fast linear single-point query interpolation, which takes about 30-50% of the whole processing time.
Single point query linear interpolation leeds to the scalar "idx"!
Michal
2023-8-31
@Bruno Luong Well OK... You are asking and I am giving you an answer. This is the perfect closing statement, Bruno. Especially if you don't know anything about the topic at hand.
Bruno Luong
2023-8-31
Cheese: For the record: My timing figure is directed to Rik's comment, then you continue to draw me in with your justification about your montecarlo simulation, that I initially never ask for. I'm not really interested in knowing the topic to be honest. That's your work not mine. You don't need to attck me with "Especially if you don't know anything about the topic at hand."
Michal
2023-8-31
编辑:Michal
2023-8-31
@Bruno Luong I just react on your statement: "Don't tell me that you mainly do scalar clipping on your montecarlo simulation."
I tried to explain to you why I use "scalar clipping" in my MC simulation. You finally comment my explanation by: "Noise interpolation has no sense to start with...."
So, what ...?
采纳的回答
Matt J
2023-8-31
编辑:Matt J
2023-8-31
Well, I think the bottom line is just that the JIT does not have any optimization for indexing expressions like idx(idx<=0). That might be because such an optimization would need to know in advance that idx is a scalar, and remains a scalar throughout the loop.
Coversely, parsing an if-statement,
if idx>threshold
end
doesn't require nearly as much work, even when idx is a vector. The if test is done by looping through the vector elements idx(i)>threshold, and as soon as one of the elements is false, it bails out.
5 个评论
Rik
2023-8-31
In short: yes, it is extremely likely all due to the JIT.
And to your second question: no, it isn't insane, just wasteful and a subjectively odd way to write the code.
I honestly don't understand why that isn't obvious when looking at the code. You produce a logical array and use it to index a variable (possibly with an all-false array). Compare that to the evaluation of an if-statement. While 1000x is a bit surprising to me, anything below 20x would have made me doubt the timing test.
Matt J
2023-8-31
Is using of idx(idx<=0) for scalar idx so insane as others suggest?
It would definitely be considered a rare usage of logical indexing. It doesn't surprise me that MathWorks never thought to optimize that case.
Michal
2023-8-31
This is only a special (scalar) case of general array logical indexing... Why should be this special case penalized by absence of optimization?
更多回答(0 个)
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Arithmetic Operations 的更多信息
标签
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!发生错误
由于页面发生更改,无法完成操作。请重新加载页面以查看其更新后的状态。
您也可以从以下列表中选择网站:
如何获得最佳网站性能
选择中国网站(中文或英文)以获得最佳网站性能。其他 MathWorks 国家/地区网站并未针对您所在位置的访问进行优化。
美洲
- América Latina (Español)
- Canada (English)
- United States (English)
欧洲
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom(English)
亚太
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)