created_date | rake_device | rounded_geo_lat | rounded_geo_lng | idle_or_moving | new_idle_moving |
---|---|---|---|---|---|
08-01-2024 11:27 | a-b | 28.29 | 76.39 | idle | idle |
08-01-2024 10:27 | a-b | 28.29 | 76.39 | idle | idle |
08-01-2024 09:27 | a-b | 28.29 | 76.39 | idle | idle |
08-01-2024 08:27 | a-b | 28.29 | 76.39 | idle | idle |
08-01-2024 07:27 | a-b | 28.29 | 76.39 | idle | idle |
08-01-2024 06:27 | a-b | 28.29 | 76.39 | moving | idle |
08-01-2024 05:27 | a-b | 28.28 | 76.39 | moving | idle |
08-01-2024 05:21 | a-b | 28.29 | 76.39 | idle | idle |
08-01-2024 05:12 | a-b | 28.29 | 76.39 | idle | idle |
08-01-2024 04:12 | a-b | 28.29 | 76.39 | idle | idle |
08-01-2024 03:12 | a-b | 28.29 | 76.39 | idle | idle |
08-01-2024 02:12 | a-b | 28.24 | 76.39 | moving | moving |
08-01-2024 03:13 | a-b | 28.24 | 76.39 | moving | moving |
08-01-2024 03:09 | a-b | 28.245 | 76.33129 | moving | moving |
08-01-2024 05:12 | a-b | 28.324 | 76.3921 | idle | idle |
08-01-2024 05:12 | a-b | 28.324 | 76.3921 | idle | idle |
Based on the above data, I want to derive a new column, new_idle_moving. The logic for this column is as follows: I have to assess every (rounded_geo_lat, rounded_geo_lng) pair. If the same pair is found anywhere else within the partition (rake_device, ordered by created_date desc), then all rows between the two occurrences should be marked as idle; any other moving row should stay as it is.
Note: we only have to replace a moving tag with idle if the lat/lng before and after that moving row are the same. Also, the data is incremental: a new entry is generated every hour, so each new row has to be compared against the entire lat/lng column.
I tried the following query, but it converts all values to idle, which is the wrong output.
WITH GroupedData AS (
    SELECT
        created_date,
        rake_device,
        rounded_geo_lat,
        rounded_geo_lng,
        idle_or_moving,
        MAX(idle_or_moving) OVER (PARTITION BY rake_device, grp ORDER BY created_date DESC) AS max_idle_or_moving
    FROM (
        SELECT
            created_date,
            rake_device,
            rounded_geo_lat,
            rounded_geo_lng,
            idle_or_moving,
            SUM(change_flag) OVER (PARTITION BY rake_device ORDER BY created_date DESC) AS grp
        FROM (
            SELECT
                created_date,
                rake_device,
                rounded_geo_lat,
                rounded_geo_lng,
                idle_or_moving,
                CASE WHEN LAG(rounded_geo_lat) OVER (PARTITION BY rake_device ORDER BY created_date DESC) = rounded_geo_lat
                      AND LAG(rounded_geo_lng) OVER (PARTITION BY rake_device ORDER BY created_date DESC) = rounded_geo_lng
                     THEN 0 ELSE 1 END AS change_flag
            FROM YourTable
        ) AS ChangeFlag
    ) AS GrpAssign
)
SELECT
    gd.created_date,
    gd.rake_device,
    gd.rounded_geo_lat,
    gd.rounded_geo_lng,
    gd.idle_or_moving,
    CASE WHEN gd.max_idle_or_moving = 'idle' THEN 'idle' ELSE gd.idle_or_moving END AS new_idle_moving
FROM GroupedData gd
ORDER BY gd.created_date DESC;
Here is the db fiddle link: https://dbfiddle.uk/w5pNGy5B
2 Answers
It is not clear how you want to condition the NEW_IDLE_MOVING column, but you can get the data from the previous/next row using the LAG() and LEAD() analytic functions. The answer adds some other columns that could help you build the right condition for changing ‘moving’ to ‘idle’. My final result has two new columns: one for rows with the same previous and next values (_AND), and another for rows where either the previous or the next values are the same (_OR). It is not clear which one you need, since you first mention the same values as the previous/next row and later say that all rows between those with the same values are affected. Anyway, you’ll get the picture if you try to implement the sample below:
… see it here https://dbfiddle.uk/NsFrZsln
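For readers who cannot open the fiddle, here is a minimal sketch of the LAG()/LEAD() idea. It is not the query from the fiddle: the table name `readings`, the output column names, and the exact reading of _AND (current pair equals both neighbours) and _OR (current pair equals either neighbour) are assumptions made for illustration. It runs the SQL through Python's `sqlite3` module on a subset of the question's rows, and it assumes an SQLite build with window functions (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE readings (
           created_date TEXT, rake_device TEXT,
           rounded_geo_lat REAL, rounded_geo_lng REAL,
           idle_or_moving TEXT)"""
)
# Subset of the question's sample rows. The timestamps are kept as the
# sample's strings; they sort correctly only because all rows share one date.
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?, ?)",
    [
        ("08-01-2024 07:27", "a-b", 28.29, 76.39, "idle"),
        ("08-01-2024 06:27", "a-b", 28.29, 76.39, "moving"),
        ("08-01-2024 05:27", "a-b", 28.28, 76.39, "moving"),
        ("08-01-2024 05:21", "a-b", 28.29, 76.39, "idle"),
        ("08-01-2024 03:12", "a-b", 28.29, 76.39, "idle"),
        ("08-01-2024 02:12", "a-b", 28.24, 76.39, "moving"),
    ],
)

neighbour_sql = """
WITH neighbours AS (
    SELECT created_date,
           idle_or_moving,
           rounded_geo_lat || '/' || rounded_geo_lng AS lat_lng,
           LAG(rounded_geo_lat || '/' || rounded_geo_lng)  OVER w AS prev_lat_lng,
           LEAD(rounded_geo_lat || '/' || rounded_geo_lng) OVER w AS next_lat_lng
    FROM readings
    WINDOW w AS (PARTITION BY rake_device ORDER BY created_date DESC)
)
SELECT created_date,
       idle_or_moving,
       -- _AND: the current pair matches BOTH neighbouring rows
       CASE WHEN lat_lng = prev_lat_lng AND lat_lng = next_lat_lng
            THEN 'idle' ELSE idle_or_moving END AS new_idle_moving_and,
       -- _OR: the current pair matches EITHER neighbouring row
       CASE WHEN lat_lng = prev_lat_lng OR lat_lng = next_lat_lng
            THEN 'idle' ELSE idle_or_moving END AS new_idle_moving_or
FROM neighbours
ORDER BY created_date DESC
"""
neighbour_rows = conn.execute(neighbour_sql).fetchall()
for row in neighbour_rows:
    print(row)
```

On this subset the _OR column flips the 06:27 row to idle, but the 05:27 row (28.28/76.39) stays moving in both columns, because LAG() and LEAD() only see the immediately adjacent rows.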
Note:
DATE_START and DATE_END are the first and last occurrence of the same LAT/LNG values.
LAT_LNG/PREV_LAT_LNG/NEXT_LAT_LNG are the concatenated LAT and LNG values from the current/previous/next row.
The subquery also includes (just in case) the values from the previous/next row for both LAT and LNG. This way you have all the values from the three windowing rows in the same row.
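The first/last occurrence idea could be applied to the "everything in between becomes idle" rule roughly as follows. This is a sketch under assumptions, not the fiddle's query: the table name `readings`, the CTE name `spans`, and the choice to treat "between" as strictly between the first and last occurrence are all made up for illustration. It uses only GROUP BY and EXISTS, run through Python's `sqlite3` module on a subset of the question's rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE readings (
           created_date TEXT, rake_device TEXT,
           rounded_geo_lat REAL, rounded_geo_lng REAL,
           idle_or_moving TEXT)"""
)
# Subset of the question's sample rows. The timestamps are kept as the
# sample's strings; they sort correctly only because all rows share one date.
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?, ?)",
    [
        ("08-01-2024 07:27", "a-b", 28.29, 76.39, "idle"),
        ("08-01-2024 06:27", "a-b", 28.29, 76.39, "moving"),
        ("08-01-2024 05:27", "a-b", 28.28, 76.39, "moving"),
        ("08-01-2024 05:21", "a-b", 28.29, 76.39, "idle"),
        ("08-01-2024 03:12", "a-b", 28.29, 76.39, "idle"),
        ("08-01-2024 02:12", "a-b", 28.24, 76.39, "moving"),
    ],
)

span_sql = """
WITH spans AS (
    -- first and last occurrence of every lat/lng pair per device
    SELECT rake_device,
           rounded_geo_lat,
           rounded_geo_lng,
           MIN(created_date) AS date_start,
           MAX(created_date) AS date_end
    FROM readings
    GROUP BY rake_device, rounded_geo_lat, rounded_geo_lng
)
SELECT r.created_date,
       r.idle_or_moving,
       -- a 'moving' row becomes 'idle' when it lies strictly inside
       -- the span of some repeated lat/lng pair of the same device
       CASE WHEN r.idle_or_moving = 'moving'
             AND EXISTS (SELECT 1
                         FROM spans s
                         WHERE s.rake_device = r.rake_device
                           AND r.created_date > s.date_start
                           AND r.created_date < s.date_end)
            THEN 'idle' ELSE r.idle_or_moving END AS new_idle_moving
FROM readings r
ORDER BY r.created_date DESC
"""
span_rows = conn.execute(span_sql).fetchall()
for row in span_rows:
    print(row)
```

Because `spans` is recomputed over the whole table on every run, a newly inserted hourly row is compared against the entire lat/lng history, which is what the incremental requirement in the question asks for. On a large table this may need an index on (rake_device, rounded_geo_lat, rounded_geo_lng, created_date) or a materialized spans table.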
See example
With your test data output is