logical indexing with a smaller array should throw a warning

I am a heavy user of logical indexing. I think the default behaviour of allowing indexing with a differently sized logical array, without any warning, is a very dangerous practice. Even worse, the documentation is actually wrong: you can over-index an array as long as the extra elements of the logical array are all false. For example,
x = ones(10,1);
l = [ true ; false(15,1)];
x(l);
runs without any error or warning. If you are unlucky enough to have a logical array whose trailing entries are mostly false, you will probably not detect related bugs for a long time.
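The two cases can be contrasted in a short script. This is only an illustrative sketch: the variable names maskPadded, maskBeyond and errored are made up for the example, and the try/catch deliberately does not depend on the exact error message, which differs between MATLAB releases.

```matlab
x = ones(10,1);

% Trailing false entries beyond numel(x) are silently ignored.
maskPadded = [true; false(15,1)];
r = x(maskPadded);            % no error, no warning

% A true entry beyond numel(x), by contrast, does raise an error.
maskBeyond = [false(15,1); true];
errored = false;
try
    x(maskBeyond);
catch
    errored = true;           % reached because position 16 is out of range
end
```

So the oversized mask is only caught when one of its surplus entries happens to be true.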
Logical indexing is mostly used to restrict data to a subset. The logical index is often generated by complex calculations, and then the restricted and resized dataset is processed further in the same piece of code. Often, different subsets and datasets are used within the same code, with different sizes. A significant number of bugs can be avoided if the size of the logical is checked against the array.
I strongly believe that there should be an option to enable warnings about using an incorrectly sized logical array for indexing.
Does anybody have an idea on how to deal with this issue, apart from defining a function like
function r = sa(x,l)
    if any(size(x) ~= size(l))
        error('Incorrect assignment.');
    end
    r = x(l);
and littering the nice x(l) references with ugly sa(x,l) everywhere?
  6 comments
Balint Takacs 2013-1-11
Edited: Balint Takacs 2013-1-11
@Sean: MATLAB does throw an error with numeric indices, but does not with logicals.
Consider the following:
x = rand(10,1);
small = x < 0.1;
large = x > 0.9;
x_middle = x(~small & ~large);
x_small = x(small); % correct version
% coder forgets which data space 'small' is in,
% and introduces a semantic bug ...
x_small = x_middle(small);
% ... which is not detected until rand()
% draws a vector with lots of small values (never in practice).
Because logical indices are not bound to the data they are used on, unlike iterators in other languages, it is quite hard to keep track of which data space each of them belongs to, especially if there are lots of them. These types of bugs are quite easy to introduce in practice.
Checking the size will not entirely solve this lack of semantic connection, but it at least gives an opportunity to detect the cases where such a bug is highly probable.
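The same mix-up can be reproduced deterministically by replacing rand() with fixed data. The values below are made up for illustration, and the names wrong and sizesMatch are not from the original post; this is a sketch of the failure mode, not a general fix.

```matlab
x = [0.05; 0.50; 0.60; 0.95; 0.70];
small = x < 0.1;                 % 5-by-1 mask, defined on x
large = x > 0.9;
x_middle = x(~small & ~large);   % 3-by-1: [0.50; 0.60; 0.70]

% The mask belongs to x, not x_middle, yet this runs silently because
% all true entries of 'small' fall within the first 3 positions.
wrong = x_middle(small);         % returns 0.50 instead of 0.05

% A plain size comparison is enough to flag this particular mix-up.
sizesMatch = isequal(size(x_middle), size(small));   % false
```

The size check only helps here because the two data spaces differ in length; two subsets of equal size would still pass it.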
Andy 2024-10-9
I agree, a warning for size mismatch would be very useful. The warning could very well be "off" by default. In my case, most times it would be sufficient to switch the warning on, run the code once, and then switch it back off again, just to make sure no indexing bugs are present.
Your suggested function r = sa(x, l) only works for the case where the size of the data matrix and the index matrix are identical. But it would be nice to also detect size mismatches where logical indices are only operating in a subset of the dimensions, for example: y = x(:, idx). The only way I can think of to achieve this is to also require another input to sa(), that specifies the dimension(s), but now things are starting to get really messy. Clearly, a built-in warning would be a lot nicer.
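One possible way to avoid a separate dimension argument is to pass the subscripts themselves as a cell array and compare each logical subscript with the extent of the array along its own dimension. The sketch below is an assumption about how this could look: maskSizesOk and the identifier 'sa:maskSize' are hypothetical names, and the check assumes one subscript per dimension (it does not handle a single linear mask or trailing-dimension collapsing).

```matlab
% Hypothetical helper: true when every logical subscript in the cell
% array 'subs' has as many elements as x has along that dimension.
% Non-logical subscripts (numeric indices, ':') are not checked.
maskSizesOk = @(x, subs) all(arrayfun( ...
    @(k) ~islogical(subs{k}) || numel(subs{k}) == size(x, k), ...
    1:numel(subs)));

x = magic(4);
idx = [true false true];         % 3 elements, but x has 4 columns
subs = {':', idx};               % equivalent to x(:, idx)

if ~maskSizesOk(x, subs)
    warning('sa:maskSize', 'Logical subscript size does not match the array.');
end
y = x(subs{:});                  % still indexes, as MATLAB normally would
```

Call sites still have to be rewritten to build the subscript cell, so this does not remove the clutter; it only avoids having to state the dimension explicitly.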


Answers (4)

Jim Svensson 2023-4-6
MATLAB should definitely require that the logical indexing mask is exactly the same size as the data being indexed. Not doing so is very bad.

Jan 2013-1-10
Logical indexing is not even implemented efficiently. I'm going to publish a faster version on the File Exchange, but it handles only the right-hand side of assignments. I do not know how to replace the left-hand side of an assignment, e.g. in:
L = rand(1, 100) < 0.5;
X = rand(10, 10);
X(L) = X(L) - 1; % MEXing the RHS is easy, but the LHS?!
any(size(x) ~= size(l)) is a bad idea, because it fails when x and l have a different number of dimensions, e.g. in the example above. Mixing linear indexing and logical indexing is important and very useful.
I'm sure that the behavior of logical indexing will not be changed, in order to preserve backward compatibility.
  2 comments
Balint Takacs 2013-1-10
They should not change its behaviour, but they can add a warning which can be turned off. BTW, I do want the check to fail when the number of dimensions does not match.
Jan 2013-1-10
Edited: Jan 2013-1-10
@Balint: I cannot believe that you want to get a strange error message about a bad usage of the eq operator. The test should not fail in case of a mismatch, but return TRUE:
if ~isequal(size(x), size(l))
    error('Incorrect assignment in logical index operation.');
end
Did you measure the time required to ignore a warning? It is a surprisingly high overhead, and if the warning were enabled, users might be confused by getting dozens of warnings from correctly working toolbox functions.
Therefore I suggest testing the dimensions explicitly, instead of injecting this extra test into the standard functionality.
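The difference between the two tests shows up as soon as the arrays differ in their number of dimensions. The following sketch uses made-up sizes, and the names neThrew and mismatch are illustrative only.

```matlab
x = ones(2,3,4);
l = true(2,3);            % fewer dimensions than x

% Comparing the size vectors elementwise errors out by itself, because
% [2 3 4] and [2 3] have incompatible lengths.
neThrew = false;
try
    any(size(x) ~= size(l));
catch
    neThrew = true;
end

% isequal accepts inputs of any size and simply reports the mismatch.
mismatch = ~isequal(size(x), size(l));
```

With isequal the caller gets to choose the error message, rather than receiving one about the comparison operator.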



Jonathan Sullivan 2013-1-10
This is an interesting proposition. While I'm very much against throwing an error in this case, I would be open to having a warning issued. Not when the sizes differ, though, but only when the number of elements differs.
I have been known to use column vectors to index row vectors and vice versa, and I think that is OK. But I do want to be made aware when I'm using a 100 element logical array to index a vector that has 150 elements.
Something like:
if numel(x) ~= numel(l)
    warning('Logic Index array has a different number of elements than the array being indexed.');
end
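If such a check is written by hand, giving the warning an identifier makes it possible to switch it off selectively, which touches on the concern about noisy warnings raised in the other answer. This is a sketch under assumptions: the identifier 'logidx:numelMismatch' is invented for the example and is not a built-in MATLAB warning.

```matlab
x = 1:150;
l = true(100,1);          % column mask on a row vector, and 50 elements short

lastwarn('');             % reset the last-warning state
if numel(x) ~= numel(l)
    warning('logidx:numelMismatch', ...
        'Logical index has %d elements, but the array has %d.', ...
        numel(l), numel(x));
end
[~, id] = lastwarn;       % identifier of the warning just issued

% The check can be silenced without touching other warnings:
% warning('off', 'logidx:numelMismatch');
r = x(l);                 % indexing itself proceeds as usual
```

A column mask on a row vector of equal length would pass this check, which matches the usage described above.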

Matt J 2013-1-10
I tend to agree with you about the dangers. If it's a feature, it's one I have never had use for in many years of using MATLAB. The only rationale for it that I can think of is that it can save you memory, if you know your trues are concentrated in the beginning of the index array, to discard the trailing falses.
One option is to define your own sub-class of double (or whatever) and write a subsref method that throws the warning. Below is the beginnings of such a sub-class, with an illustration of its use.
>> x=myclass(1:10);
>> l=[ true(3,1) ; false(15,1)];
>> x(l)
Warning: Logical mask of untypical size
> In myclass>myclass.subsref at 25
ans =
1 2 3
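classdef myclass < double
    methods
        function obj = myclass(data)
            obj@double(data);
        end
        function out = subsref(obj, S)
            % Check only the first level of ()-indexing.
            if strcmp(S(1).type, '()')
                subs = S(1).subs;
                nsubs = numel(subs);
                dims = size(obj);
                for ii = 1:nsubs
                    idx = subs{ii};
                    if ~islogical(idx), continue; end
                    if nsubs == 1
                        expected = numel(obj);            % single (linear) mask
                    elseif ii < nsubs
                        expected = size(obj, ii);         % one dimension
                    else
                        expected = prod(dims(ii:end));    % last subscript spans trailing dims
                    end
                    if numel(idx) ~= expected
                        warning('myclass:maskSize', 'Logical mask of untypical size');
                    end
                end
            end
            out = subsref@double(obj, S);
        end
        function display(obj)
            display(double(obj))
        end
    end
end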
  1 comment
Walter Roberson 2013-1-10
I, of course, have taken advantage of the feature from time to time ;-) Saves having to calculate the padding with false() that I would have to add to make the number of elements the same.
