Best match location data - two tables

5 次查看(过去 30 天)
Hi
I have location data (longitude and latitude) for certain sites and I want to compare these two with a national databse to know the local authority of these sites + their postcodes.
I have two tables:
Sitename Latitude Longitude
14182-Pembroke Refinery 51.686081 -5.0271217
8059-New Rugby 52.376283 -1.2860374
8037-Ketton 52.636537 -0.54737666
And
Postcode Local authority Longitude Latitude
CV21 2RY Rugby -1.288555 52.376856
TR26 2JQ Cornwall -5.490944 50.207428
SY10 7RN Shropshire -3.067703 52.917641
SA71 5SJ Pembrokeshire -5.02713 51.686093
PE9 3SX Rutland -0.5462 52.636828
By best matching both the latitude and longitude data from the first table with the second, the postcode and local authority will be known.
please note that the longitude and latitude are NOT exactly matched but will differ slightly hence I want the best match possible based on both info.
Please see the Excel sheet which contain all the data.
I appreciate your help with this.
Many thanks

采纳的回答

Image Analyst
Image Analyst 2021-6-10
I'm not one of those 3 talented individuals you asked for, but maybe you'll like my answer. See if it's what you want. If so, please accept the answer. If not, tell me what needs to be different.
%---------------------------------------------------------------------------------
% Create reference Look Up Table
% Postcode Local authority Longitude Latitude
% CV21 2RY Rugby -1.288555 52.376856
% TR26 2JQ Cornwall -5.490944 50.207428
% SY10 7RN Shropshire -3.067703 52.917641
% SA71 5SJ Pembrokeshire -5.02713 51.686093
% PE9 3SX Rutland -0.5462 52.636828
lookUpTable = table('Size', [5, 4], ...
'Variabletypes', {'string', 'string', 'double', 'double'}, ...
'VariableNames', {'Postcode', 'LocalAuthority', 'Longitude', 'Latitude'});
lookUpTable.Postcode = ["CV21 2RY"; "TR26 2JQ"; "SY10 7RN"; "SA71 5SJ"; "PE9 3SX"];
lookUpTable.LocalAuthority = ["Rugby"; "Cornwall"; "Shropshire"; "Pembrokeshire"; "Rutland"];
lookUpTable.Longitude = [-1.288555; -5.490944; -3.067703; -5.02713; -0.5462];
lookUpTable.Latitude = [52.376856; 50.207428; 52.917641; 51.686093; 52.636828]
%---------------------------------------------------------------------------------
% Create a table of unknown sites that we will try to find in the reference table.
% Sitename Latitude Longitude
% 14182-Pembroke Refinery 51.686081 -5.0271217
% 8059-New Rugby 52.376283 -1.2860374
% 8037-Ketton 52.636537 -0.54737666
tableOfUnknowns = table('Size', [3, 3], ...
'Variabletypes', {'string', 'double', 'double'}, ...
'VariableNames', {'Sitename', 'Longitude', 'Latitude'});
tableOfUnknowns.Sitename = ["14182-Pembroke Refinery"; "8059-New Rugby"; "8037-Ketton"];
tableOfUnknowns.Longitude = [-5.0271217; -1.2860374; -0.54737666];
tableOfUnknowns.Latitude = [51.686081; 52.376283; 52.636537]
%---------------------------------------------------------------------------------
% Run down the table of unknowns seeing if it's reasonably close
% to any row of the reference look up table.
rows = height(tableOfUnknowns);
for row = 1 : rows
fprintf('Processing row #%d of %d.\n', row, rows);
% Find the distance (radius) of this row to all the reference rows.
deltaLat = lookUpTable.Latitude - tableOfUnknowns.Latitude(row);
deltaLon = lookUpTable.Longitude - tableOfUnknowns.Longitude(row);
distances = sqrt(deltaLat .^ 2 + deltaLon .^ 2);
% Find the min distance
[minDistance, rowOfMin] = min(distances);
closestPostcode = lookUpTable.Postcode(rowOfMin);
closestLocalAuthority = lookUpTable.LocalAuthority(rowOfMin);
fprintf(' The closest Local Authority to %s is %s, which has Postcode %s.\n They are separated by %f degrees.\n', ...
tableOfUnknowns.Sitename(row), closestLocalAuthority, closestPostcode, minDistance);
end
You'll see
lookUpTable =
5×4 table
Postcode LocalAuthority Longitude Latitude
__________ _______________ _________ _________
"CV21 2RY" "Rugby" -1.288555 52.376856
"TR26 2JQ" "Cornwall" -5.490944 50.207428
"SY10 7RN" "Shropshire" -3.067703 52.917641
"SA71 5SJ" "Pembrokeshire" -5.02713 51.686093
"PE9 3SX" "Rutland" -0.5462 52.636828
tableOfUnknowns =
3×3 table
Sitename Longitude Latitude
_________________________ ___________ _________
"14182-Pembroke Refinery" -5.0271217 51.686081
"8059-New Rugby" -1.2860374 52.376283
"8037-Ketton" -0.54737666 52.636537
Processing row #1 of 3.
The closest Local Authority to 14182-Pembroke Refinery is Pembrokeshire, which has Postcode SA71 5SJ.
They are separated by 0.000015 degrees.
Processing row #2 of 3.
The closest Local Authority to 8059-New Rugby is Rugby, which has Postcode CV21 2RY.
They are separated by 0.002582 degrees.
Processing row #3 of 3.
The closest Local Authority to 8037-Ketton is Rutland, which has Postcode PE9 3SX.
They are separated by 0.001212 degrees.
  4 个评论
Adam Danz
Adam Danz 2021-6-10
编辑:Adam Danz 2021-6-10
My solution using pdist2 won't work with 1.7M rows.
However, it's technically incorrect to use euclidean distance (in ImageAnalyst's answer) with lat/lon which are spherical coordinates. The coordinates should be converted to Cartesian before computing euclidean distance or the distance should be computed in spherical units (Haversine formula, for example). However since you're using relative distance and not absolute distance you may not run into any problems unless a group of coordinates are close enough to each other to be affected by the Cartesian assumption.
Image Analyst
Image Analyst 2021-6-10
I agree with Adam. I know computing distance that way is not the most precise way (perhaps the Mapping Toolbox has a more accurate formula). However I think the distances are small enough (close together on the globe), and actually I assumed the main need was not to find the distances super accurately, but to find the closest Local Authority. Since the distances will likely not be close to one another, accuracy doesn't matter for simply finding the closest location. It will find the closest one regardless if the distances are somewhat inaccurate.

请先登录,再进行评论。

更多回答(1 个)

Adam Danz
Adam Danz 2021-6-10
I'd use pdist2 to compute the pair-wise distances between your list of coordinates and the list in the database to form a matrix "D" of distanaces (they will all be positive). For each row (or column, depending on how you enter the inputs to pdist2), you'll look for the index of the minimum using [~,minidx]=min(D) or min(D,2).
However, from what I understand of map coordinates, they are not linear so you'll need to convert them to cartesian coordinates using latlon2local i think. Alternatively, instead of using pdist2 you could use distance to compute distance on the sphere.
I do not have experience with either method other than using pdist2 with cartesian coordinates.
If you give it a shot and run into a specific problem I might be able to give further advice.

产品


版本

R2020b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by