Taxi dataset - lat lon colocation

LeoAiE 2022-11-11
Edited: LeoAiE 2022-11-12
Hi everyone,
I’m practicing data science with NY Taxi data (https://www.kaggle.com/competitions/nyc-taxi-trip-duration/data). I want to check whether certain taxis were co-located, that is, within a short distance of each other.
The task I’m trying to accomplish is:
  • Measure the distance from each taxi to all other taxis
  • Check whether the taxis were within 20 m of each other (just an example threshold)
  • Check the date column associated with those lat/lon matches
  • If the dates match, the two taxis visited the general area on the same date
  • Return the two unique IDs of those taxis, the date, the distance, and the lat/lon of both of them
Here is my progress and I would really appreciate your help.
taxi = readtable('yellow_tripdata_2015-01.csv'); % read the data
taxi.uniqueID = (1:height(taxi))'; % row number as ID; randi([1000000,9999999],...) can repeat values
% distance from pick-up to drop-off in meters (deg2km returns kilometers)
taxi.distance_travelled_in_M = 1000 * deg2km(distance(taxi.pickup_latitude, taxi.pickup_longitude, taxi.dropoff_latitude, taxi.dropoff_longitude));
% split date and time into two different columns
taxi.date = datestr(datetime(taxi.tpep_pickup_datetime),'dd/mm/yyyy');
taxi.time = datestr(datetime(taxi.tpep_pickup_datetime),'HH:MM:SS');
% In this loop I take each lat/lon point and run it against the rest of the
% data - probably not an efficient way - open to suggestions
% then convert the distance to meters
% check whether the distance is within, for example, 20 meters
% and the date is the same - this step is not implemented yet, need help
% return the unique IDs of both taxis, the date of that event, the location
% (lat, lon), and the distance between them
pos_results = table(); % matches are appended here
for idx = 1:height(taxi)
    my_dist = distance(taxi.pickup_latitude(idx), taxi.pickup_longitude(idx), taxi.pickup_latitude, taxi.pickup_longitude); % distances in degrees
    my_dist = 1000 * deg2km(my_dist); % deg2km returns km, so convert to meters
    near = my_dist < 20; % rows within 20 m of point idx
    near(idx) = false; % ignore the point itself
    if any(near)
        match = taxi(near, ["uniqueID", "date", "pickup_latitude", "pickup_longitude"]);
        match.distance_m = my_dist(near); % distance from point idx to each match
        pos_results = [pos_results; match];
    end
end
2 Comments
Walter Roberson 2022-11-12
my_dist = distance(taxi.pickup_latitude(idx), taxi.pickup_longitude(idx), taxi.pickup_latitude, taxi.pickup_longitude); % distances in degrees
Not really degrees. You are doing Euclidean distance calculations on a non-linear surface. The result is only meaningful if the data is recorded near the equator. New York City is at 40.7128N and cosd() of that is about 0.76, so a degree of longitude there covers only roughly 3/4 of the distance of a degree of latitude.
Have you considered calculating Great Circle Distance?
LeoAiE 2022-11-12
Thank you for your comments! I have not. I will research how to calculate great circle distance!
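For reference, a great-circle distance can be sketched with the haversine formula in base MATLAB, without any toolbox. This is only an illustration: the function handle name haversine_m, the mean Earth radius R, and the sample coordinates are assumptions made for the example, not part of the original post.

```matlab
% Haversine great-circle distance in meters; inputs are in degrees.
% R is an assumed mean Earth radius (spherical Earth model).
R = 6371000;
haversine_m = @(lat1, lon1, lat2, lon2) 2 * R * asin(sqrt( ...
    sind((lat2 - lat1) / 2).^2 + ...
    cosd(lat1) .* cosd(lat2) .* sind((lon2 - lon1) / 2).^2));

% Two made-up points that differ by 0.00018 degrees of latitude
d_lat = haversine_m(40.7128, -74.0060, 40.71298, -74.0060);

% One degree of longitude at the equator and at New York's latitude
d_equator = haversine_m(0, 0, 0, 1);
d_nyc     = haversine_m(40.7128, 0, 40.7128, 1);
```

Because the handle uses element-wise operators, lat2 and lon2 can also be whole columns, which gives the distance from one point to every row in a single call. The ratio d_nyc / d_equator comes out close to cosd(40.7128), which is the longitude shrinkage mentioned above.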

Answers (1)

Walter Roberson 2022-11-12
Edited: Walter Roberson 2022-11-12
Instead, use rangesearch, which is a knnsearch by distance.
Either supply a custom distance function that computes Great Circle Distance, or else multiply the longitudes by cosd() of the latitude so that both coordinates are on the same scale, and then use Euclidean distance.
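A minimal sketch of the second option, assuming the Statistics and Machine Learning Toolbox (for rangesearch) is available. The sample points, the Earth radius R, and the choice of a single reference latitude lat0 are assumptions for illustration; the scaling is a local flat-Earth approximation that should be reasonable over a city-sized area.

```matlab
% Made-up pickup points in degrees; rows 1 and 2 are about 11 m apart
lat = [40.7128; 40.7129; 40.7500];
lon = [-74.0060; -74.0060; -73.9900];

R    = 6371000;        % assumed mean Earth radius in meters
lat0 = mean(lat);      % one reference latitude for the longitude scaling

% Local planar coordinates in meters: longitude is shrunk by cosd(lat0)
x = deg2rad(lon) .* cosd(lat0) * R;
y = deg2rad(lat) * R;

radius_m = 20;
% idx{i} lists the rows within radius_m of row i (row i itself included),
% sorted by increasing distance; dist{i} holds the matching distances
[idx, dist] = rangesearch([x y], [x y], radius_m);
```

Here idx{1} contains rows 1 and 2, while idx{3} contains only row 3. Working in meters means the search radius can be given directly as 20 rather than as a fraction of a degree.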
3 Comments
Walter Roberson 2022-11-12
https://www.mathworks.com/help/map/ref/distance.html if you have the mapping toolbox
LeoAiE 2022-11-12
Edited: LeoAiE 2022-11-12
Yes, I do have it, and that's why I used the distance function in my original post, but I guess I have to specify something like this:
s = referenceSphere('Earth')
distance(taxi.pickup_latitude(1), taxi.pickup_longitude(1), taxi.pickup_latitude(2), taxi.pickup_longitude(2),s)
The issue now is how to check the date column associated with those lat/lon matches. If the dates match, meaning the two taxis visited the general area on the same date, I want to return the two unique IDs of those taxis, the date, the distance, and the lat/lon of both of them.
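One possible way to approach the date check is to group the rows by calendar day first and run the radius search only within each day, so every match already shares a date. The sketch below uses a small made-up table; the variable names (id, when, day, pairs), the Earth radius, and the 20 m threshold are assumptions, and rangesearch again requires the Statistics and Machine Learning Toolbox.

```matlab
% Made-up sample standing in for the taxi table
id   = [101; 102; 103; 104];
when = datetime({'2015-01-01 08:00:00'; '2015-01-01 17:30:00'; ...
                 '2015-01-02 09:00:00'; '2015-01-01 12:00:00'}, ...
                'InputFormat', 'yyyy-MM-dd HH:mm:ss');
lat  = [40.7128; 40.7129; 40.7128; 40.7500];
lon  = [-74.0060; -74.0060; -74.0060; -73.9900];
taxi = table(id, when, lat, lon);
taxi.day = dateshift(taxi.when, 'start', 'day');    % date without the time

R = 6371000; radius_m = 20;                          % assumed radius and threshold
x = deg2rad(taxi.lon) .* cosd(mean(taxi.lat)) * R;   % local meters, east
y = deg2rad(taxi.lat) * R;                           % local meters, north

names = {'id_a','id_b','day','distance_m','lat_a','lon_a','lat_b','lon_b'};
pairs = table();
[G, days] = findgroups(taxi.day);                    % one group per calendar day
for g = 1:numel(days)
    rows = find(G == g);                             % table rows on this day
    [idx, dist] = rangesearch([x(rows) y(rows)], [x(rows) y(rows)], radius_m);
    for k = 1:numel(rows)
        keep = idx{k} > k;                           % skip self, report each pair once
        if ~any(keep), continue; end
        a = rows(k) * ones(nnz(keep), 1);            % first taxi of each pair
        b = rows(idx{k}(keep));                      % second taxi of each pair
        b = b(:);
        pairs = [pairs; table(taxi.id(a), taxi.id(b), taxi.day(a), ...
            dist{k}(keep)', taxi.lat(a), taxi.lon(a), taxi.lat(b), taxi.lon(b), ...
            'VariableNames', names)];
    end
end
```

In this sample, taxis 101 and 102 are about 11 m apart on the same day and form the only row of pairs. Taxi 103 is at the same spot as 101 but on a different date, so it is not reported.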


Release

R2022b
