Find equally (almost equal) distanced elements in an array

Question

Sardius 2022-8-16

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/1780375-find-equally-almost-equal-distanced-elements-in-an-array

评论： Sardius 2022-8-19

Assume we have an array Y1 as follows:

Y1 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-0.278441922763728,0,0,0,0,0,0,0,0,0,0,0,0.339039014549722,0,0,0,0,0,-0.983321222184805,0,0,0,0,0,0,1.81696995276160,0,0,1.10627709436174,0,0,-0.754131732193239,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.272403139239999,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.254972819695545,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.250627796693075,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.234158547982346,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.212588567489227,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.191203969721771,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0];

If you plot it, you can see that elements located at 127, 152, 178, 203, 229 are almost equally located in the array, with an almost equal distance of 25 between these elements. Given signals similar to Y1, how can we find such elements? Please note that the Y1 here is the simplest possible case, where only one set of equally distributed elements are given. In reality, there might be two or three sets of elements with different spacing and different locations, For example:

Y2 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-0.2784,0,0,0,0,0.5,0,0,0,0,0.35,0,0,0,0,-0.339039014549722,0,0,0,0,0.83321222184805,0,0,0,0,0,0,1.81696995276160,0,0,1.10627709436174,0,0,0,-0.754131732193239,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.272403139239999,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.254972819695545,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.250627796693075,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.234158547982346,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.212588567489227,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.191203969721771,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0];

If you plot Y2, you can see that elements located at 20, 25, 30, 35, 40 are equally distributed with the distance of 5 in between these elements (Set 1), and elements located at 127, 152, 178, 203, 229 are almost equally located with an almost equal distance of 25 between these elements (Set 2).

To sum up, given Y1, it is desired to obtain [127, 152, 178, 203, 229], and given Y2, it is expected to obtain {[20, 25, 30, 35, 40] and [127, 152, 178, 203, 229]}.

Y1 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-0.278441922763728,0,0,0,0,0,0,0,0,0,0,0,0.339039014549722,0,0,0,0,0,-0.983321222184805,0,0,0,0,0,0,1.81696995276160,0,0,1.10627709436174,0,0,-0.754131732193239,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.272403139239999,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.254972819695545,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.250627796693075,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.234158547982346,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.212588567489227,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.191203969721771,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0];
figure; plot(Y1); axis tight; grid;
ind1 = find(Y1)
20    32    38    45    48    51    76   127   152   178   203   229
diff(ind1)
12     6     7     3     3    25    51    25    26    25    26
Y2 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-0.2784,0,0,0,0,0.5,0,0,0,0,0.35,0,0,0,0,-0.339039014549722,0,0,0,0,0.83321222184805,0,0,0,0,0,0,1.81696995276160,0,0,1.10627709436174,0,0,0,-0.754131732193239,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.272403139239999,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.254972819695545,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.250627796693075,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.234158547982346,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.212588567489227,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.191203969721771,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0];
figure; plot(Y2); axis tight; grid;
ind2 = find(Y2)
20    25    30    35    40    47    50    54    79   130   155   181   206   232
diff(ind2)
5     5     5     5     7     3     4    25    51    25    26    25    26

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

请先登录，再回答此问题。

Answer 1

John D'Errico 2022-8-16

1
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/1780375-find-equally-almost-equal-distanced-elements-in-an-array#answer_1027655

编辑：John D'Errico 2022-8-16

在 MATLAB Online 中打开

You don't really say enough about your question to make it easy to solve. How long will these vectors be? How many non-zeros? Do you know, in advance the strides that you want to search for? Or are you just looking for ANY (approximately) common strides? The latter is of course fairly difficult to do with any degree of intelligence. But you might be able to make something work using clustering in 1-dimension. For example...

Y1 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-0.278441922763728,0,0,0,0,0,0,0,0,0,0,0,0.339039014549722,0,0,0,0,0,-0.983321222184805,0,0,0,0,0,0,1.81696995276160,0,0,1.10627709436174,0,0,-0.754131732193239,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.272403139239999,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.254972819695545,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.250627796693075,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.234158547982346,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.212588567489227,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.191203969721771,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0];

What matters are not what the elements are but WHERE they lie. So start by extracting them in a virtually sparse form.

ind1 = find(Y1)
ind1 = 1×12
    20    32    38    45    48    51    76   127   152   178   203   229

As you can see, this is a pretty small set. But now, what matters is how far apart are they? So compute the interpoint distances. A simple way to do this is to use pdist.

D1 = squareform(pdist(ind1'))
D1 = 12×12
  12    18    25    28    31    56   107   132   158   183   209
   0     6    13    16    19    44    95   120   146   171   197
   6     0     7    10    13    38    89   114   140   165   191
  13     7     0     3     6    31    82   107   133   158   184
  16    10     3     0     3    28    79   104   130   155   181
  19    13     6     3     0    25    76   101   127   152   178
  44    38    31    28    25     0    51    76   102   127   153
  95    89    82    79    76    51     0    25    51    76   102
 120   114   107   104   101    76    25     0    26    51    77
 146   140   133   130   127   102    51    26     0    25    51

That allows us to visualize the distance between any pair of non-zero elements. It also helps us to think about how we might look for clusters of points that lie at a common distance. Such clusters will lie near the main diagonal of the matrix D1, but not always perfectly so. Hmm. Anyway, suppose we decided to look for a set of equally distanced points, at roughly a distance of 25 as a stride? First, can we easily do that?

% find points that are at a stride of 25 +/- 3, converting the distance

% matrix into a graph.

G1 = graph(abs(D1 - 25) <= 3)

G1 =

graph with properties: Edges: [8×1 table] Nodes: [12×0 table]

Now, can we find a set of nodes that are connected?

conncomp(G1)
ans = 1×12
     1     2     3     1     1     1     1     4     4     4     4     4

This gives us a set of nodes that were connected in the graph. Do you see that the last 4 nodes were all connected, thus at a stride of 25+/- 3?

G1.Edges
ans = 8×1 table
    EndNodes
    ________

     1     4
     1     5
     5     7
     6     7
     8     9
     9    10
    10    11
    11    12

If you can see how to read that table, you would see that nodes 8, 9, 10, 11, and 12 all lie in that common stride.

Similarly, look at Y2.

Y2 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-0.2784,0,0,0,0,0.5,0,0,0,0,0.35,0,0,0,0,-0.339039014549722,0,0,0,0,0.83321222184805,0,0,0,0,0,0,1.81696995276160,0,0,1.10627709436174,0,0,-0.754131732193239,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.272403139239999,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.254972819695545,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.250627796693075,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.234158547982346,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.212588567489227,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.191203969721771,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0];
ind2 = find(Y2);
D2 = squareform(pdist(ind2'))
D2 = 14×14
   5    10    15    20    27    30    33    58   109   134   160   185   211
   0     5    10    15    22    25    28    53   104   129   155   180   206
   5     0     5    10    17    20    23    48    99   124   150   175   201
  10     5     0     5    12    15    18    43    94   119   145   170   196
  15    10     5     0     7    10    13    38    89   114   140   165   191
  22    17    12     7     0     3     6    31    82   107   133   158   184
  25    20    15    10     3     0     3    28    79   104   130   155   181
  28    23    18    13     6     3     0    25    76   101   127   152   178
  53    48    43    38    31    28    25     0    51    76   102   127   153
 104    99    94    89    82    79    76    51     0    25    51    76   102

Again, we might use some tools to decide to look for the common cluster at a stride of 5.

G2 = graph(abs(D2 - 5) <= 1)

G2 =

graph with properties: Edges: [5×1 table] Nodes: [14×0 table]

conncomp(G2)

ans = 1×14

1 1 1 1 1 2 3 2 4 5 6 7 8 9

As you can see there, we have a cluster of the first 5 nodes, all with a stride of roughly 5.

G2 = graph(abs(D2 - 25) <= 1)

G2 =

graph with properties: Edges: [6×1 table] Nodes: [14×0 table]

conncomp(G2)

ans = 1×14

1 2 3 4 5 6 2 7 7 8 8 8 8 8

And again, a connected cluster of the ast 5 nodes, each roughly at a distance of 25.

With some effort, you might even be able to look for the common stride intelligently. Perhaps a way might exist to use clustering to search for those (NEARLY) common strides.

At the same time, be careful, since if your vector is terribly long, then things will go to hell, as that interpoint distance matrix will get larger. So don't expect this to work nearly as well if you have many thousands of nodes. Since there are now several questions left up in the air and unresolved, I'll stop here. Regardless, the graph theoretic tools In MATLAB will prove to be a terribly useful way to approach the problem.

2 个评论
显示无隐藏无

Sardius 2022-8-16

编辑：Sardius 2022-8-16

Dear John,

First, I would like to deeply appreciate for the time you spent and many thanks for the explanations. Regarding your questions, I have the answers as follows:

How long will these vectors be? These are ultrasonic signals of various lengths. But they are not longer than 1024 and not shorter than 256.

How many non-zeros? We don't know. Depends on each signal.

Do you know, in advance the strides that you want to search for? Definitely No, that is the key we are looking for (to be allocated as time of flight, TOF, of the signals).

I will go through your method tomorrow and then will get back to you. Meanwhile, since the code might need to be embedded on a microcontroller, the method needs to work pretty fast as well.

Thanks very much again.

John D'Errico 2022-8-16

Since the vectors will not be too long, the matrices will not grow immensely large. The sparser they are, the better of course. Would the above ideas be workable on a microcontroller? God only knows. Certainly not me. :) It would be worth testing on a larger problem.

But really, it looks like your main problem is one of automatically searching for a common stride. Once you have identified the stride, or at least a good candidate, then you have a graph-theoretic solution available.

One simple option would be to use brute force. That is, perform the above analysis, looking for groups of elements with a common stride of 2 +/- 1. Next, search for groups with a common stride of 3 +/- 1, continue the search, increasing the stride, until you find the set of maximal length. Could you do that? Of course. Whether it would be managable, I cannot say. It is very likely that with some thought a better solution exists.

请先登录，再进行评论。

Answer 2

John D'Errico 2022-8-16

1
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/1780375-find-equally-almost-equal-distanced-elements-in-an-array#answer_1027715

编辑：John D'Errico 2022-8-16

在 MATLAB Online 中打开

As a subtly different answer, now that I understand your question a little better, consider this idea. My goal in this answer will be to see if an approximately common stride can be identified. The problem is, it need not be EXACT, so you would want to have a tolerance.

Y2 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-0.2784,0,0,0,0,0.5,0,0,0,0,0.35,0,0,0,0,-0.339039014549722,0,0,0,0,0.83321222184805,0,0,0,0,0,0,1.81696995276160,0,0,1.10627709436174,0,0,-0.754131732193239,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.272403139239999,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.254972819695545,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.250627796693075,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.234158547982346,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.212588567489227,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.191203969721771,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0];
ind2 = find(Y2);

Now, assume we don't know what stride to search for. But it must happen between nodes that are close to each other.

conv(ind2,[1 -1],'valid')
ans = 1×13
     5     5     5     5     7     3     3    25    51    25    26    25    26
conv(ind2,[1 0 -1],'valid')
ans = 1×12
    10    10    10    12    10     6    28    76    76    51    51    51
conv(ind2,[1 0 0 -1],'valid')
ans = 1×11
    15    15    17    15    13    31    79   101   102    76    77
conv(ind2,[1 0 0 0 -1],'valid')
ans = 1×10
    20    22    20    18    38    82   104   127   127   102
conv(ind2,[1 0 0 0 0-1],'valid')
ans = 1×10
    20    22    20    18    38    82   104   127   127   102

Now, I might scan through those differences, looking for some common, or nearly common strides. If I did do that, I would notice that a stride of something like 5-7 might be of interest. Then a common stride of 10 seems to pop out, as do 15 and 25.

Next, do the connected component analysis for each candidate, as I showed in my other answer.

G2 = graph(squareform(abs(pdist(ind2')' - 6) <= 1));
connections = conncomp(G2)
connections = 1×14
     1     1     1     1     1     1     2     1     3     4     5     6     7     8
ind2(connections == 1) % 1 the number that appears the most frequently in connections 
ans = 1×7
    20    25    30    35    40    47    53

So we have found one sub-sequence of length 7.

I could now repeat the above analysis, looking for subsequences of length 10.

G2 = graph(squareform(abs(pdist(ind2')' - 10) <= 1));
connections = conncomp(G2)
connections = 1×14
     1     2     1     2     1     3     1     4     5     6     7     8     9    10
ind2(connections == 1) % again, 1 indicates the longest subsequence
ans = 1×4
    20    30    40    50

Next, consider subsequences of length approximately 25...

G2 = graph(squareform(abs(pdist(ind2')' - 25) <= 1));
connections = conncomp(G2)
connections = 1×14
     1     2     3     4     5     6     2     7     7     8     8     8     8     8
ind2(connections == 8)
ans = 1×5
   129   154   180   205   231

All of this was just a little playing around. You could automate something that works, but first, you need to find something that works.

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

Sardius 2022-8-19

在 MATLAB Online 中打开

Dear John,

Thank you very much for your answer. I read it carefully. I am trying to modify the following idea. Currently it is capable of detecting the longest interval atleast. The remaining issue is now in case of exact same length (width), it will show only the first one.

ind1 = find(Y2);
D = diff(ind1);
if ~isempty(D)
D = [D D(end)];
x = [];
for i = 1:length(D)-1
Kind = find(abs(D(i)-D(i+1))<=1);
if isempty(Kind)
    Kind = 0;
    x(1,i) = Kind;
end
  x(1,i) = Kind;
end
zpos = find(~[0 x 0]);
[~, grpidx] = max(diff(zpos));
key = zpos(grpidx);
TOF = abs(ind1(key)-ind1(key+1));
end

I will be working on it and will let you know in case of any better solution and/or progress towards the current approach. Let me know your comments please. Thanks very much again.

请先登录，再进行评论。

Find equally (almost equal) distanced elements in an array

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

回答（2 个）

2 个评论
显示无隐藏无

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

另请参阅

类别

标签

Community Treasure Hunt

Find equally (almost equal) distanced elements in an array

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

回答（2 个）

2 个评论 显示 无隐藏 无

1 个评论 显示 -1更早的评论隐藏 -1更早的评论

另请参阅

类别

标签

Community Treasure Hunt

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

2 个评论
显示无隐藏无

1 个评论
显示 -1更早的评论隐藏 -1更早的评论