How do I avoid getting fooled by 'implicit expansion'?

24 次查看(过去 30 天)
I am trying my very best not to be a grumpy old man here, but I just wasted the better part of an hour because of the addition of 'implicit expansion', and I need to hear from the proponents of this feature how to live with it. The offending bit of code was:
smoothed=smooth(EEG.data(iC,:),EEG.srate*60,'moving');
deviation=EEG.data(iC,:)-smoothed;
this bit of code kept giving me a 'memory error', which was odd, since 'deviation' should be a lot smaller than 'EEG'. However, EEG is very large, and I don't have room for two of them in my memory, so I figured there might be some intermediate step in calculation that was tripping me up, or perhaps windows was hogging the resources, or something else out of my control. It took two restarts of matlab, and one reboot of the pc, and finally a complicated rewrite of the script (which still didn't fix it) to finally realize that 'smooth' (which is a matlab built-in) was changing the dimensions, such that I was subtracting a column vector from a row vector.
What is the good coding practice for this not to occur? I would have thought that having a language that complained when you performed an ill-defined operation was the good solution to this problem, but I can see from my google search that apparently a lot of people think that 'implicit expansion' is a great good. How do you avoid pitfalls such as this? (please note that if I hadn't been running out of memory, I might have made it a good deal further down the script before noticing that something was off).
Should I just never trust any command to preserve the dimensions of my arrays, even if it's an inbuilt one?
  7 个评论
kaare
kaare 2022-2-15
@John D'Errico, why the high horse? When I wrote the original question, I was a veteran matlab user. It’s been four years more now, and I still hate IE. Just because people disagree with you, doesn’t mean you get to write off their opinion as a sign of inexperience. That’s just you being arrogant and presumptuous.
In the original example, I knew what the size of my array was (it was a row). I was expecting matlab not to change it for me. And I certainly wasn’t expecting my code to break because matlab changed the dimensions. I was used to matlab not changing it for me (in 2015, the ‘deviation’ variable would have had exactly the size of EEG.data(iC,:)).
But, sure, IE is here to stay. I’m not. I’m slowly moving my (extensive) code base out of matlab. Obviously, not just due to this infuriating little feature, but the lack-of-a-solution to problems like this certainly hasn’t made me stay any longer. You get to keep your party to yourself, and to be as condescending about it as you need to.
Jan
Jan 2022-2-16
@Alexander Thomas: "make it optional to implicitly expand" - No, this cannot work. Remember, that Matlab's toolbox function expect an enabled implicit expansion. If a switching is introduced, each and every toolbox function must store the former setting, enable the expansion and restore the former value finally. This would waste too much time.
Many of the codes, I write here in the forum to solve questions make use of the implicite expansion. In the first 3 years I've commented this by "% Auto-expanding, >= R2016b" and added a comment with the corresponding bsxfun call. Today Most of the Matlab users in this forum are familiar with this feature. Removing it or even allowing to switch it of manually would cause serious incompatibilities.
It was always a feature and a problem, that Matlab tries to be smart. Prefering to operator along the first nonsingelton dimension is convenient and dangerous, because beginners tend to forget, that the dimensions can differ from their expectations. The length command was a really bad idea, findstr(a,b) also: It searched the shorter in the longer of the elements. Of course "hold on" save some keyclicks compared to "hold('on')", but the non-functional form of commands was a source of bugs frequently in the past, when Matlab's guessing fails, if the argument is a char vector or number. The command plot(1:10, rand(1,10)) creates an axes automagically and a surrounding figure as well - except if there is an existing already.
Implicite expanding is another smart feature. It would have been a more secure decision to introduce new operators like $+, $*, etc. But MathWorks decided for making it transparent. I was not happy about it, because I prefer a program to stop, if something unexpected happens, but the auto-magic produces an unexpected result instead. But it the auto-expanding is applied intentionally, it is an efficient and powerful tool.
@John D'Errico: "Implicit expansion is a tool that as you gradually become a more experienced user, you will appreciate" - I do not agree. I'd prefer an explicit operator instead of increasing the power of existing operators. But, as you said: Such is life, especially as a programmer. It is part of Matlab now and I use it. The questions in this forum show, that the expanding is not a frequent cause of bugs, e.g. rand(1e6) appears more frequently.
@kaare: Thanks for caring about the tone in the forum. As I read John's answer, the most emotional part is: "Sigh". I do neither see a high horse, nor arrogance, nor presumptuousness, nor a condescending statement. Of course, politeness is the base of an efficiently working community.

请先登录,再进行评论。

采纳的回答

Jan
Jan 2018-2-19
编辑:Jan 2022-2-16
Should I just never trust any command to preserve the dimensions of my arrays,
even if it's an inbuilt one?
The shapes of the output of built-in function have been subject to changes in the past and I expect this to happen in the future also. Therefore I catch my assumptions about shapes in a "unit-test" like function. So when I write
smoothed = smooth(EEG.data(iC,:), EEG.srate*60, 'moving');
I add this to the test function:
y = smooth(rand(1, 100), 5, 'moving');
if ~isrow(y)
error('SMOOTH does not reply a row vector for a row vector input.');
end
Nevertheless, it is a lot of work to do this for all assumptions. In many cases I'm not even aware of what I assume, e.g. for strncmp('hello', '', 2), which has changed its behavior in the past also.
In your case it would have been smart and efficient, if
deviation = EEG.data(iC,:) - smoothed
causes an error. Unfortunately the implicit expansion tries to handle this smartly, but it is smarter than the programmer in many cases. When it is intended, the implicit expansion is nice and handy, but it is an invitation for bugs also. All we can do is to live with it, because it is rather unlikely that TMW removes this feature. But I cannot be bad to write this as an enhancement request to them.
To answer the actual question:
How do I avoid getting fooled by 'implicit expansion'?
Use Matlab < R2016b, at least for testing your code.

更多回答(4 个)

Guillaume
Guillaume 2018-2-19
Yes, some complain about implicit expansion. For me, it's a logical expansion (pun intended) of the 1-D scalar expansion to N-D. On the other hand, I was also happy with bsxfun.
As to your problem, at the heart it's a design failure on your part I'm afraid. If you never validate your assumptions, you can expect that things go wrong. Instead of a single large script, use functions. The first thing that a function should do is validate that its inputs are as expected. If two vectors are expected as input, check they're the same size, etc. Whenever I write a function, the first thing written is the help, with a clear listing of the input and their requirements, followed by actual validation of these requirements.
In your case, a simple
assert(size(v1) == size(v2))
would have caught the problem.
The implicit expansion forces you to be more careful about the shape of your vectors. In my opinion, it's not a bad thing.
  2 个评论
kaare
kaare 2018-2-19
... I'm sorry, but this is not very helpful advice. We are, literally, talking about two lines of code. Your suggestion amounts to checking my assumptions every time I call a built-in routine. Would you have placed 'deviation=EEG.data(iC,:)-smoothed;' inside a subfunction?
John D'Errico
John D'Errico 2018-2-19
编辑:John D'Errico 2018-2-19
Admittedly, I would not be using assert here to check dimensions. But the fact remains that I would KNOW the shape of my arrays. If you don't know positively what shape arrays and vectors are produced by any operation, then it pays to learn to be more careful.
When you use a function, make sure you know what it returns. Read the help. Try a test case, in case the help was not sufficiently clear for you.

请先登录,再进行评论。


Matt J
Matt J 2018-2-19
Well, the error message would have told you that the out-of-memory was originating in the line
deviation=EEG.data(iC,:)-smoothed;
I should think that checking the dimensions of the right hand side quantities in the debugger would have been your first step.
  1 个评论
kaare
kaare 2018-2-19
As it happens, my first step was to check the size (in bytes) of the ingredients to the offending line. That did not help me, since 'smoothed' was, as expected, a lot smaller than 'EEG'. It is very possible that next time I get this error, I'll remember that implicit expansion is a thing, and that the memory error might not in fact have anything to do with the size of the array.

请先登录,再进行评论。


Petorr
Petorr 2022-6-20
I would like the debugger to automatically highlight where implicit array expansion is taking place. Is that option available? For example:
Here, A might be 100*1 and B, 1*500 so the highlighting would let me know that these are compatible unequal array sizes and will be implicitly expanded.
  2 个评论
Steven Lord
Steven Lord 2022-6-21
No such option exists. Would you expect this option to always highlight that operation in your code? A might be 100-by-1 and B might be 1-by-500. But they may both be scalars. There are circumstances where a static analysis of the code could prove at parse-time that the operation will perform implicit expansion, but what about this one?
function z = computeProduct(x, y)
z = x.*y;
end
Should the .* in this code be highlighted or not?
Petorr
Petorr 2022-6-21
In this example, it would only be highlighted while debugging within computeProduct, as in putting a breakpoint at the line z=x.*y. It would be comparable to the hover-tip that shows array dimensions. I use that often, even though it depends on the program state. I see how it might seem a little inconsistent since all other highlighting is based on static parsing, but some variation of it might be worth considering. If I get so good with the implicit sizing that I never need this feature, I'll be sure to come back and comment ; )

请先登录,再进行评论。


Rav
Rav 2024-3-25
Ok, since I use eeglab I managed to trace your screw-up.
EEG.data is sort-of horizontal, but 'smooth' outputs a vertical array.
Actually, the error message also shows what's wrong, look at dimensions. That's not the expansion
Correct debig would be to call "size(smoothed)" and "size(EEG.data)".
Correct solution to your code is this:
deviation=EEG.data(iC,:)-smoothed.';
'smoothed' here is transposed - a difference of 2 symbols
A good coding practice is to run all unfamiliar and failing functions through console by hand and just look at the output - not just console output, but also environment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by