Closest Points between two datasets without using pdist2

Hi, i have two matrices A, of size mx2, and B, of size nx2.
Each row of both matrices identifies a point in 2D space. What i want to do is to write a code, that does not involve loops and pdist2, as fast as possible, that tells me the indices of the row of both A and B such that the distance squared of the two points is the minimum one.
Example:
A=[5 6;
1 2;
3 4
1 8];
B=[3 0;
2 1;
4 1;
3 5;
1 2];
My function must be like [indA,indB]=function(A_matrix,B_matrix)
I want as output [2,5]=function(A,B)
I found a solution using for-loops but i really would like to find a solution using repmat that involves vectorization.
Thanks

4 个评论

This appears to be a homework assignment.
Our policy here on MATLAB Answers is to offer only hints for such, not complete, working code.
It's not an homework assignment, it is an operation within a much bigger frame that has to be performed as fast as possible. The problem is of course simplified to make you better understand the core task that must be addressed. Thanks
repmat is slower than implicit expansion in many cases.
There are vectorized ways to get indices of the minimum, but they are not necessarily faster than using find() (would have to be tested) and would have problems with ties.
How could i use implicit expansion in this case? Furthermore, is it better to use a combination of find+min or to use sort twice (first by rows, then by columns) on the resultant matrix?

请先登录,再进行评论。

 采纳的回答

A=[5 6;
1 2;
3 4
1 8];
B=[3 0;
2 1;
4 1;
3 5;
1 2];
P = permute(sum((A-permute(B,[3 2 1])).^2,2),[1 3 2]) %will be size(A,1) by size(B,1)
P = 4×5
40 34 26 5 32 8 2 10 13 0 16 10 10 1 8 68 50 58 13 36
A_idx_in_B = sum(cumprod(P ~= min(P,[],1),1),1)+1
A_idx_in_B = 1×5
2 2 2 3 2
B_idx_in_A = sum(cumprod(P ~= min(P,[],2),2),2)+1
B_idx_in_A = 4×1
4 5 4 4
This code resolves ties in favor of the first match.

3 个评论

Given A_idx_in_A and B_idx_in_B how can i just get, respectively, 2 and 5? I just care about the indeces of the minimum distance (if i have ties it's not important which one i take).
Thanks for your time.
The idx_in variables are the indices of the minimum distance. For example the closest entry in B to A(4,:) is the 4th entry of A_idx_in_B which is 3, so A(4,:) is closest to B(3,:)

请先登录,再进行评论。

更多回答(1 个)

Came up with a solution:
[m,~] = size(A);
[n,~] = size(B);
A_rep = repmat(A,n,1);
B_rep = B';
B_rep = repmat(B_rep(:)',m,1);
dist = hypot( A_rep(:,1)-B_rep(:,1:2:end), A_rep(:,2)-B_rep(:,2:2:end) );
ind = find(dist == min(dist));
indB = floor((ind-1)./m)+1
indA = mod(ind-(indB-1)*m,n)

3 个评论

Is there a way to avoid the usage of find? In my solution i sorted the squared sums and i took the first element. As a result, the operation was way faster than find+min.
Not as far as I know, this is the method I've always used for cases like this. What is nice about this though is if there's a tie for the closest it will return all of those indicies of the tie.
I'll try some minor tweaks tomorrow starting from your excellent job. Thank you so much for your time!

请先登录,再进行评论。

类别

帮助中心File Exchange 中查找有关 Surrogate Optimization 的更多信息

产品

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by