skip to Main Content

I need to extract data from a postgres table, where I want to look at nearby records within a 30 seconds of timesstamp with two columns match and fetch the value which is having max value.

 id    |    datetime         | col_1 | col_2|dist|  dir
-------+---------------------+-------+------+----+------
 103   | 2023-10-06 20:12:19 |   135 |   13 | 57 | EAST
 213   | 2023-10-06 20:32:34 |   135 |   13 | 54 | NORTH
 103   | 2023-10-06 20:00:19 |   135 |   13 | 70 | EAST
 103   | 2023-10-06 20:42:34 |   135 |   13 | 54 | NORTH
 213   | 2023-10-06 20:12:19 |   135 |   13 | 57 | NORTH
 222   | 2023-10-06 18:42:34 |   135 |   13 | 45 | EAST
 222   | 2023-10-06 20:42:34 |   135 |   13 | 54 | EAST

First, 'id' and 'dir' column should be same then need to check the 'datetime' values with in 30 mins range and get all column values where 'dist' column value is max.

Output –

  id   |    datetime         | col_1 | col_2|dist|  dir
-------+---------------------+-------+------+----+------
 103   | 2023-10-06 20:00:19 |   135 |   13 | 70 | EAST
 213   | 2023-10-06 20:12:19 |   135 |   13 | 57 | NORTH
 103   | 2023-10-06 20:42:34 |   135 |   13 | 54 | NORTH
 222   | 2023-10-06 18:42:34 |   135 |   13 | 45 | EAST
 222   | 2023-10-06 20:42:34 |   135 |   13 | 54 | EAST

I am trying below query but didn’t work-

WITH cte AS ( SELECT id, datetime, col_1, col_2, dist, dir
              FROM table_1)
SELECT table_1.*
FROM table_1
JOIN cte ON table_1.id = cte.id AND table_1.dir = cte.dir
        AND table_1.datetime BETWEEN cte.datetime - INTERVAL '30 MINUTE'
                                      AND cte.datetime + INTERVAL '30 MINUTE'
ORDER BY id;

create DB queries –

CREATE TABLE table_1 
(
    id character varying NOT NULL, PRIMARY KEY(datetime, id),
    datetime timestamp without time zone NOT NULL,
    col_1 int,
    col_2 int,
    dist int,
    dir character varying(5)
);
INSERT INTO 
table_1(id, datetime, col_1, col_2, dist, dir)
VALUES
  ('103', '2023-10-06 20:12:19', 135, 130, 57, 'EAST'),
  ('213', '2023-10-06 20:32:34', 135, 130, 54, 'NORTH'),
  ('103', '2023-10-06 20:00:19', 135, 130, 70, 'EAST'),
  ('103', '2023-10-06 20:42:34', 135, 130, 54, 'NORTH'),
  ('213', '2023-10-06 20:12:19', 135, 130, 57, 'NORTH'),
  ('222', '2023-10-06 18:42:34', 135, 130, 45, 'EAST'),
  ('222', '2023-10-06 20:42:34', 135, 130, 54, 'EAST');

dbfiddle

2

Answers


  1. The following query returns the described result set:

    WITH
      cte AS (
        SELECT id, datetime, col_1, col_2, dist, dir,
               MAX(dist) OVER (PARTITION BY id, dir ORDER BY datetime
                               RANGE BETWEEN INTERVAL '30 minute' PRECEDING
                                         AND INTERVAL '30 minute' FOLLOWING) AS max_dist
          FROM table_1)
    SELECT id, datetime, col_1, col_2, dist, dir
      FROM cte
      WHERE dist = max_dist;
    

    The query partitions the data by id and dir and determines the maximum dist within a +/- 30 minute range of each data point and then filters for only those rows where dist is equal to the corresponding maximum.

    Login or Signup to reply.
  2. For PostgreSql 10 and earlier, you may use self join instead window function with time interval.
    For partition

       PARTITION BY id, dir 
        ORDER BY datetime
        RANGE BETWEEN INTERVAL '30 minute' PRECEDING
                  AND INTERVAL '30 minute' FOLLOWING
    

    use JOIN condition

    on t2.id=t1.id and t2.dir=t1.dir
       and t2.datetime between t1.datetime - interval '30 min'
             and t1.datetime + interval '30 min'
    

    See example

    with jt as(
    select t1.id,t1.datetime,t1.col_1,t1.col_2,t1.dist,t1.dir
      ,t2.id id2,t2.datetime datetime2,t2.col_1 col1_2,t2.col_2 col2_2,t2.dist dist2,t1.dir dir2
      ,max(t2.dist)over(partition by t1.id,t1.datetime,t1.dir) maxdist
      ,row_number()over(partition by t1.id,t1.datetime,t1.dir order by t2.dist)rnW
    from table_1 t1
    left join table_1 t2 on t2.id=t1.id and t2.dir=t1.dir
       and t2.datetime between t1.datetime - interval '30 min'
             and t1.datetime + interval '30 min'
    )
    select id, datetime, col_1, col_2, dist, dir
    from jt
    where dist=maxdist and rnW=1
    

    Fiddle

    Before filter we have

    id datetime col_1 col_2 dist dir datetime2 dist2 maxdist rnw
    >103 2023-10-06 20:00:19 135 13 70 EAST 2023-10-06 20:12:19 57 70 1
    103 2023-10-06 20:00:19 135 13 70 EAST 2023-10-06 20:00:19 70 70 2
    103 2023-10-06 20:12:19 135 13 57 EAST 2023-10-06 20:12:19 57 70 1
    103 2023-10-06 20:12:19 135 13 57 EAST 2023-10-06 20:00:19 70 70 2
    >103 2023-10-06 20:42:34 135 13 54 NORTH 2023-10-06 20:42:34 54 54 1
    >213 2023-10-06 20:12:19 135 13 57 NORTH 2023-10-06 20:32:34 54 57 1
    213 2023-10-06 20:12:19 135 13 57 NORTH 2023-10-06 20:12:19 57 57 2
    213 2023-10-06 20:32:34 135 13 54 NORTH 2023-10-06 20:32:34 54 57 1
    213 2023-10-06 20:32:34 135 13 54 NORTH 2023-10-06 20:12:19 57 57 2
    >222 2023-10-06 18:42:34 135 13 45 EAST 2023-10-06 18:42:34 45 45 1
    >222 2023-10-06 20:42:34 135 13 54 EAST 2023-10-06 20:42:34 54 54 1
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search