How to choose 60 unique coordinates ('A') from 600 coordinates('B'), nearest to 60 other coordinates('C')

Question

moh mor 2024-10-21

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/2161085-how-to-choose-60-unique-coordinates-a-from-600-coordinates-b-nearest-to-60-other-coordinates

编辑： John D'Errico 2024-10-21

Hello,

I have a question. Imagine we have an array 'C' containing 60*2 elements and another vector 'B' containing 600*2 elements. Each of these 2 arrays represent 'x' and 'y' coordinates of our points. I wish to choose 60 nearest coordinates from 'B' in a way that these coordinates( array 'A') be unique and nearest to coordinates in 'C'.

I want to do this with highest speed because I have to implement it multiple times. Definitely using 'for loop' would increase complexity so I want to avoid it.

Any help would be appreciated.

Thank you

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

请先登录，再回答此问题。

Answer 1

John D'Errico 2024-10-21

1
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/2161085-how-to-choose-60-unique-coordinates-a-from-600-coordinates-b-nearest-to-60-other-coordinates#answer_1534960

编辑：John D'Errico 2024-10-21

在 MATLAB Online 中打开

As I pointed out to @Voss and @Star Strider, if the points in B mst be unique, then there is an issue. If you simply choose the closest points (where it might be best to use knnsearch anyway) then one of the points in B may be simultaneously closest to different points in C.

The simple answer is to test to see if there are any replicate points chosen, where point k from B is simultaneously closest to two (or more!) points in C. But then you need to decide which other point to add to the set. I would probably do this by a resampling scheme, where all of the points already chosen are effectively removed from B. Now find a new point, which is the next closest to the point from C which was impacted. The code would not be too complex, and it would be reasonably efficient.

I'll give an example. Here, I'll use a smaller set of points to make thigns easier to visualize, and also to make it more likely there will be a conflict, I have made the size of B smaller. Larger sets will of course reduce the probability of a failure.

C = rand(10,2);
B = rand(15,2);
[idxB,D] = knnsearch(B,C,K=1)
idxB = 10×1
     6
     1
     1
    13
    12
     7
     1
     1
    13
    10
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>
D = 10×1
    0.1258
    0.2244
    0.1678
    0.1289
    0.0267
    0.2911
    0.1258
    0.1877
    0.2728
    0.0600
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>

In this set, you can see that point # 1 from B was simultaneously closest to 4 different points in C. And point 13 was also duplicated in that list. We could test to see if there were any replicates simply enough. For example,

if numel(unique(idxB)) < size(C,1)
    % stuff
end

But now there is an ambiguity. Point #1 from B was closest to point 7 in C. Do we now choose a second best point from B for points [2 3 8] in C? What if there is again an issue? Is there some optimal set which is "best" in some unspecified way? And this is where your problem specification fails, because it does not suggest how these issues must be addressed.

Will this be a frequent problem? SURE AS HELL YES! Consider this simulation.

failcount = 0;
for i = 1:100
    C = rand(60,2);
    B = rand(600,2);
    [idxB,D] = knnsearch(B,C,k=1);
    if numel(unique(idxB)) ~= size(C,1)
        failcount = failcount + 1;
    end
end
failcount
failcount =
99

So out of 100 random sets generated, 99 of them had a conflict. If uniqueness is an issue, then you WILL need to write code that worries about these conflicts.

Anyway, COULD you resolve all conflicts using a loop? Well, yes.

C = rand(60,2);
B = rand(600,2);
idxB = NaN(size(C,1),1);
Btemp = B;
for i = 1:size(C,1)
  idxB(i) = knnsearch(Btemp,C(i,:),k=1);
  Btemp(idxB(i),:) = inf;
end
numel(unique(idxB))
ans =
    60

And clearly this will result in a unique set, that has no conflicts.

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

Answer 2

Voss 2024-10-21

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/2161085-how-to-choose-60-unique-coordinates-a-from-600-coordinates-b-nearest-to-60-other-coordinates#answer_1534950

编辑：Voss 2024-10-21

在 MATLAB Online 中打开

C = rand(60,2);

B = rand(600,2);

[~,idx] = min(sum((permute(C,[1 3 2])-permute(B,[3 1 2])).^2,3),[],2);

A = B(idx,:);

figure

hold on

plot(B(:,1),B(:,2),'.b')

plot(C(:,1),C(:,2),'xr')

plot(A(:,1),A(:,2),'og')

A contains the nearest point in B to each point in C. This doesn't guarantee unique rows of A.

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

John D'Errico 2024-10-21

And because it does not insure uniqueness, it fails to solve the problem posed.

请先登录，再进行评论。

Answer 3

Star Strider 2024-10-21

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/2161085-how-to-choose-60-unique-coordinates-a-from-600-coordinates-b-nearest-to-60-other-coordinates#answer_1534955

在 MATLAB Online 中打开

Using pdist2 is an option —

C = rand(60,2);

B = rand(600,2);

[A,I] = pdist2(B,C,'euclidean','smallest',1)

A = 1×60

0.0344 0.0214 0.0052 0.0389 0.0349 0.0287 0.0120 0.0206 0.0079 0.0323 0.0089 0.0251 0.0155 0.0172 0.0109 0.0129 0.0429 0.0451 0.0286 0.0266 0.0215 0.0253 0.0099 0.0054 0.0148 0.0050 0.0092 0.0271 0.0256 0.0420

<mw-icon class=""></mw-icon>

I = 1×60

235 357 66 122 92 122 221 334 332 421 21 492 384 135 246 391 138 257 600 294 554 102 36 329 216 116 542 412 167 246

<mw-icon class=""></mw-icon>

figure

scatter(C(:,1), C(:,2), '.b', 'DisplayName','C')

hold on

scatter(B(:,1), B(:,2), '.g', 'DisplayName','B')

scatter(B(I,1), B(I,2), 'or', 'DisplayName','Closest ‘B’ To ‘C’')

hold off

axis('equal')

legend('Location','northoutside', 'Orientation','horizontal')

.

3 个评论
显示 1更早的评论隐藏 1更早的评论

Star Strider 2024-10-21

在 MATLAB Online 中打开

To ‘guarantee uniqueness’ implies to me looping through all the points in both matrices in order to determine if any points in ‘C’ are equidistant from more than one point in ‘B’. It would certainly be possible to do that, however it would require a loop and a significant amount of time.

Another option would be to return two distances for each ‘C’ point to determine if any distances are the same. If none of them are, that is likeley as close as you can get to ‘guarantee uniqueness’.

Asking the find function to return the indices of any columns where the distances are the same, in this instance returns an empty vector. So the top row of ‘I’ returns the indices of the unique rows of ‘C’ that are closest to elements of ‘B’, and the top row of ‘D’ returns their distances.

C = rand(60,2);

B = rand(600,2);

[D,I] = pdist2(B,C,'euclidean','smallest',2)

D = 2×60

0.0124 0.0167 0.0154 0.0294 0.0060 0.0190 0.0451 0.0099 0.0127 0.0140 0.0223 0.0459 0.0127 0.0660 0.0296 0.0069 0.0124 0.0197 0.0214 0.0162 0.0321 0.0236 0.0269 0.0350 0.0245 0.0314 0.0136 0.0346 0.0198 0.0091 0.0189 0.0474 0.0162 0.0391 0.0146 0.0250 0.0458 0.0200 0.0192 0.0424 0.0363 0.0471 0.0211 0.0674 0.0395 0.0154 0.0260 0.0423 0.0341 0.0265 0.0324 0.0253 0.0288 0.0477 0.0311 0.0448 0.0340 0.0355 0.0208 0.0172

<mw-icon class=""></mw-icon>

I = 2×60

435 300 56 464 198 49 568 380 41 568 552 70 172 312 415 593 233 213 242 242 233 142 506 371 427 96 71 369 3 480 463 94 466 117 597 325 297 141 189 311 120 320 268 508 55 13 490 298 531 531 490 127 389 568 435 229 60 258 441 347

<mw-icon class=""></mw-icon>

A = B(I(1,:),:);

% disp(A) % Optional

notunique = find(any(diff(D)==0,1))

notunique = 1x0 empty double row vector

Check = D(:,notunique)

Check = 2x0 empty double matrix

figure

scatter(C(:,1), C(:,2), '.b', 'DisplayName','C')

hold on

scatter(B(:,1), B(:,2), '.g', 'DisplayName','B')

scatter(A(:,1), A(:,2), 'or', 'DisplayName','‘A’ Closest ‘B’ To ‘C’')

hold off

axis('equal')

legend('Location','northoutside', 'Orientation','horizontal')

figure

stem3(A(:,1), A(:,2), D(1,:), Marker='.')

grid on

xlabel('A(:,1)')

ylabel('A(:,2)')

zlabel('D')

Quod erat demonstrandum!

.

John D'Errico 2024-10-21

编辑：John D'Errico 2024-10-21

在 MATLAB Online 中打开

Seems simple enough.

C = rand(60,2);
B = rand(600,2);
idxB = NaN(size(C,1),1);
Btemp = B;
for i = 1:size(C,1)
    idxB(i) = knnsearch(Btemp,C(i,:),k=1);
    Btemp(idxB(i),:) = inf;
end
numel(unique(idxB))
ans = 60

Just use a tool like pdist2 (or knnsearch) to find the closest element of B to each element of C, one at a time. Then zap out the element of B that you just found, effectively removing it from the search.

But since uniqueness was stated to be important, AND since almost always, a call to pdist2 or knnsearch will almost always experience at least a few conflicts, you need to do something.

请先登录，再进行评论。

How to choose 60 unique coordinates ('A') from 600 coordinates('B'), nearest to 60 other coordinates('C')

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

回答（3 个）

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

3 个评论
显示 1更早的评论隐藏 1更早的评论

另请参阅

类别

标签

产品

版本

Community Treasure Hunt

How to choose 60 unique coordinates ('A') from 600 coordinates('B'), nearest to 60 other coordinates('C')

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

回答（3 个）

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

1 个评论 显示 -1更早的评论隐藏 -1更早的评论

3 个评论 显示 1更早的评论隐藏 1更早的评论

另请参阅

类别

标签

产品

版本

Community Treasure Hunt

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

3 个评论
显示 1更早的评论隐藏 1更早的评论