Geographical Binning and evaluation of results

4 次查看(过去 30 天)
Hello guys,
I drove though the city and measured. From this drive I extracted: time, lat, lot and my_var - the variable I was measuring.
1689×4 table
Var1 lat lot my_var
2021-04-12 10:50:02.000 51.18 14.431 20.68
2021-04-12 10:50:03.000 51.18 14.431 19.6
2021-04-12 10:50:05.000 51.18 14.431 19.7
2021-04-12 10:50:06.000 51.18 14.431 20.3
2021-04-12 10:50:07.000 51.18 14.431 20.21
We were performing those drives in lets say "normal traffic situation" so the speed we drove wary. This is the reason why I would like to create a geographical bins, 5x5m each seems to be right for my purposes.
I have used hista(lat,lon,0.0025) which gave me (I hope) right centers of the bins a counts how many measurements were taken in each.
%[latbin,lonbin,count] = hista(lat,lon,binarea)
[latbin, lonbin, count] = hista(lon,lat,0.0025);
results =[latbin, lonbin, count];
results =
14.4495 51.1668 2.0000
14.4500 51.1668 5.0000
14.4486 51.1673 2.0000
14.4491 51.1673 3.0000
14.4504 51.1673 4.0000
14.4467 51.1677 2.0000
14.4472 51.1677 2.0000
14.4477 51.1677 3.0000
14.4481 51.1677 2.0000
14.4504 51.1677 2.0000
Now, I would like to apply this on my measurements, but I got stuck. I think the right option for this should be DSEARCHN but I did notquite get it. I have to use it on lat and lon together.
Will appreciate any advice.
  3 个评论
Adam Danz
Adam Danz 2021-5-17
编辑:Adam Danz 2021-5-17
Also, according to the documentation,
"[latbin,lonbin,count] = hista(lat,lon,binarea) uses the bin size specified by the input binarea, which must be in square kilometers"
If you want 5x5 mile bins, wouldn't you need hista(lat,lon,12.9499) since 5 sq mi equals 12.9499 sq km?
Lastly, your comment "hista(lat,lon,0.0025)" correctly provides lat and lon in the first and second inputs but the code under your comment reverses those first two inputs, "[latbin, lonbin, count]=hista(lon,lat,0.0025);".
Karel Starý
Karel Starý 2021-5-18
Hello Adam,
thanks for reply and sorry for incomplete information on my part.
  1. I would like to make a median from all the measurements in one bin and use just this one new value as a reference.
  2. You are totally right with the bins, not really sure where i got the 0.0025
  3. The reverse of lat and lon was done by me during the preparation of this question.
They real question for me is how to connect each row of measurements to the correct bin. Because hista gave me just the bin centers. Ideally, I need code where inputs gonna be the lat and the lon from the table I have created and the output will point on the bin center from hista bins. Or something similar.
Thanks in advace!

请先登录,再进行评论。

采纳的回答

Adam Danz
Adam Danz 2021-5-18
编辑:Adam Danz 2021-5-18
Unfortunately hista()does not return a vector of bin numbers for each input coordinate which is hard to believe. According to the documentation, hista() outputs the bin centers so you just need to find which bin center each point is closest to. That's easily done in cartesian coordinates so I temporarily converted the (lon,lat) coordinate to equidistant (x,y) coordinates using grn2eqa (see Equal-Areas in Geographic Statistics) and pdist2 to compute distances between each coordinate and each bin center. The bin ID is then added to the table and the binned coordinates are plotted for verification.
Finally, the medians of each bin are computed.
This demo uses fake data and bins the coordinates at 2500 sq km.
% Produce some fake data
rng('default')
timestamp = datetime(2021,04,12,10,50,02) + seconds(cumsum(randi(5,1,100)))';
lat = rand(size(timestamp)) + 40;
lon = rand(size(timestamp))+10;
my_var = rand(size(timestamp))*5+15;
T = table(timestamp, lat, lon, my_var);
head(T) % view first few rows of the table
ans = 8×4 table
timestamp lat lon my_var ____________________ ______ ______ ______ 12-Apr-2021 10:50:07 40.162 10.644 15.298 12-Apr-2021 10:50:12 40.794 10.379 18.41 12-Apr-2021 10:50:13 40.311 10.812 15.212 12-Apr-2021 10:50:18 40.529 10.533 15.357 12-Apr-2021 10:50:22 40.166 10.351 17.608 12-Apr-2021 10:50:23 40.602 10.939 15.484 12-Apr-2021 10:50:25 40.263 10.876 19.091 12-Apr-2021 10:50:28 40.654 10.55 19.088
% bin at 2500 sq km
[latbin, lonbin, count] = hista(T.lat,T.lon,2500);
% Convert coordinates to equidistant cartesian coordinates
[T.latEq, T.lonEq] = grn2eqa(T.lat, T.lon);
[latbinEq, lonbinEq] = grn2eqa(latbin, lonbin);
% Compute distance between each coordinate and each bin-center
dist = pdist2([T.lonEq, T.latEq],[lonbinEq, latbinEq]);
% Add bin ID numbers to table
[~, T.bin] = min(dist,[],2);
head(T) % view updated table
ans = 8×7 table
timestamp lat lon my_var latEq lonEq bin ____________________ ______ ______ ______ _______ _______ ___ 12-Apr-2021 10:50:07 40.162 10.644 15.298 0.18578 0.64495 7 12-Apr-2021 10:50:12 40.794 10.379 18.41 0.18114 0.65335 6 12-Apr-2021 10:50:13 40.311 10.812 15.212 0.1887 0.64694 8 12-Apr-2021 10:50:18 40.529 10.533 15.357 0.18383 0.64983 8 12-Apr-2021 10:50:22 40.166 10.351 17.608 0.18065 0.645 4 12-Apr-2021 10:50:23 40.602 10.939 15.484 0.19092 0.6508 11 12-Apr-2021 10:50:25 40.263 10.876 19.091 0.18982 0.6463 11 12-Apr-2021 10:50:28 40.654 10.55 19.088 0.18413 0.65149 8
% Verify
figure()
hold on
scatter(lonbin, latbin, 200, 1:numel(lonbin), 'Marker','*','LineWidth',2) % bin centers
scatter(T.lon, T.lat, 50, T.bin,'filled')
cmap = colorcube(255);
colormap(cmap(1:end-10,:))
xlabel('longitude'); ylabel('latitude')
latbinUnq = unique(latbin);
lonbinUnq = unique(lonbin);
set(gca, 'xtick', lonbinUnq(2:end)-diff(lonbinUnq)/2, ...
'ytick',latbinUnq(2:end)-diff(latbinUnq)/2)
grid on
Compute medians of each bin
binnedMedian = splitapply(@median,T.my_var, T.bin);
T2 = table((1:numel(lonbin))', lonbin, latbin, binnedMedian, ...
'VariableNames', {'binID','LonBin','LatBin','Med_my_var'})
T2 = 12×4 table
binID LonBin LatBin Med_my_var _____ ______ ______ __________ 1 9.98 39.973 15.952 2 9.98 40.499 16.304 3 9.98 41.028 17.461 4 10.332 39.973 16.967 5 10.332 40.499 17.943 6 10.332 41.028 17.5 7 10.684 39.973 17.41 8 10.684 40.499 16.719 9 10.684 41.028 18.332 10 11.037 39.973 15.729 11 11.037 40.499 17.877 12 11.037 41.028 16.955
  2 个评论
Karel Starý
Karel Starý 2021-5-19
Words can’t possibly express my gratitude. I have lots of work before me, but this is the kick-start I needed. Thank you, dear sir & god bless!

请先登录,再进行评论。

更多回答(0 个)

类别

Help CenterFile Exchange 中查找有关 Descriptive Statistics and Visualization 的更多信息

标签

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by