I want to calculate the distance between two neighboring data points. The following is my code:
from haversine import haversine
import pandas as pd
# Convert deviceTime column to datetime format
df['deviceTime'] = pd.to_datetime(df['deviceTime'])
# Calculate the Haversine distance between neighboring data points where date of previous row is same as date of next row and vid of previous row is same as vid of next row
df['distance (km)'] = df.groupby(['vehicleId', df['deviceTime'].dt.date])['lat', 'lon'].apply(lambda x: haversine(x.shift(), x)).fillna(0)
# Set distance (km) value for first row to 0
df.at[0, 'distance (km)'] = 0
print(df)
The code produces an error:
/home/ubuntu/snap/jupyter/common/lib/python3.7/site-packages/haversine/haversine.py in _ensure_lat_lon(lat, lon)
77 Ensure that the given latitude and longitude have proper values. An exception is raised if they are not.
78 """
---> 79 if lat < -90 or lat > 90:
80 raise ValueError(f"Latitude {lat} is out of range [-90, 90]")
81 if lon < -180 or lon > 180:
TypeError: '<' not supported between instances of 'str' and 'int'
[Image: example of data in the lat and lon columns]
I don't know why this turns out to be a TypeError. Does anyone know how to resolve it?
2 Answers
I used another method to resolve it. Here is my code:
import numpy as np
import pandas as pd
from haversine import haversine
However, I still can't figure out why the previous code is not working. Feel free to post here if you have any clue about it.
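One likely explanation, offered as a guess based on the traceback: as far as I can tell, haversine() unpacks each argument as a (lat, lon) pair, and unpacking a DataFrame yields its column labels rather than its values. The latitude then becomes the string 'lat', and the range check `lat < -90` compares a str with an int. The sketch below reproduces that behaviour with pandas alone; the sample coordinates are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
points = pd.DataFrame({"lat": [3.10, 3.20], "lon": [101.60, 101.70]})

# Tuple-unpacking a DataFrame iterates over its column labels,
# so this assigns the strings 'lat' and 'lon', not coordinates.
lat, lon = points
print(type(lat), lat)

# The same kind of comparison that fails inside the range check.
try:
    lat < -90
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

If this is indeed the cause, the fix is to pass haversine() plain (lat, lon) tuples of numbers, one pair of rows at a time, instead of whole DataFrames.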
I think it could be that the shift fails if the groups are of length 1. Maybe you can drop these rows.
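For what it's worth, here is a minimal sketch of one way to get per-group neighbor distances without dropping rows. It swaps the haversine package for a hand-written NumPy version of the haversine formula, so it is not the code from the answer above. The helper name `haversine_km`, the Earth radius constant, and the sample data are my own assumptions; the column names follow the question, and rows are assumed to be already sorted by time within each vehicle.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumed constant)

def haversine_km(lat1, lon1, lat2, lon2):
    """Vectorised haversine distance in km; inputs are in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "vehicleId": [1, 1, 1, 2],
    "deviceTime": pd.to_datetime([
        "2023-01-01 08:00", "2023-01-01 08:05",
        "2023-01-02 09:00", "2023-01-01 10:00",
    ]),
    "lat": [0.0, 0.0, 1.0, 5.0],
    "lon": [0.0, 1.0, 1.0, 5.0],
})

# Shift lat/lon within each (vehicle, date) group; the first row of every
# group, including groups with a single row, gets NaN as its previous point.
groups = df.groupby(["vehicleId", df["deviceTime"].dt.date])
prev = groups[["lat", "lon"]].shift()

# NaN distances (no previous point in the group) are replaced with 0.
df["distance (km)"] = haversine_km(
    prev["lat"], prev["lon"], df["lat"], df["lon"]
).fillna(0)
print(df)
```

In this sketch a group of length 1 does not make the shift fail: it just produces NaN for that row, which fillna(0) turns into a distance of 0.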