Is there a way to collapse ordered rows by terminal values with postgres window clause - Postgresql

Ethan
February 16, 2023
201 views
0 votes
2 Answers

I have a table foo:

some_fk	some_field	some_date_field
1	A	1990-01-01
1	B	1990-01-02
1	C	1990-03-01
1	X	1990-04-01
2	B	1990-01-01
2	B	1990-01-05
2	Z	1991-04-11
2	C	1992-01-01
2	B	1992-02-01
2	Y	1992-03-01
3	C	1990-01-01

some_field has 6 possible values: [A,B,C,X,Y,Z]
Where [A,B,C] signify opening or continuation events and [X,Y,Z] signify closing events. How do I get each span of time between the first opening event and closing event of each span, partitioned by some_fk, as shown in the table below:

some_fk	some_date_field_start	some_date_field_end
1	1990-01-01	1990-04-01
2	1990-01-01	1991-04-11
2	1992-01-01	1992-03-01
3	1990-01-01	NULL

*Note that a non-terminated time span ends with NULL

I do have a solution that involves 3 common table expressions, but I’m wondering if there is a (better/more elegant/canonical) way to do this in PostgreSQL without nested queries.

My approach was something like:

WITH ranked AS (
  SELECT
    RANK() OVER (PARTITION BY some_fk ORDER BY some_date_field) AS "rank"
    some_fk,
    some_field,
    some_date_field
  FROM foo    
), openers AS (
  SELECT * FROM ranked WHERE some_field IN ('A','B','C')
), closers AS (
  SELECT
    *,
    LAG("rank") OVER (PARTITION BY some_fk ORDER BY "rank") AS rank_lag
  FROM ranked WHERE some_field IN ('X','Y','Z')
)
SELECT DISTINCT
  openers.some_fk,
  FIRST_VALUE(openers.some_date_field) OVER (PARTITION BY some_fk ORDER BY "rank")
    AS some_date_field_start,
  closers.some_date_field AS some_date_field_end
FROM openers
JOIN closers
  ON openers.some_fk = closers.some_fk
WHERE openers.some_date_field BETWEEN COALESCE(closers.rank_lag, 0) AND closers.rank

… but I feel there must be a better way.

Thanks in advance for the help.

Answers

The basis of the query is to use LAG to determine if the previous record was a closure.

SELECT *,
LAG(some_field) OVER (PARTITION BY some_fk ORDER BY some_date_field) Previous_some_field
FROM foo

This allows you to filter on the correct 4 records from your expected results, with the first 2 columns included; your mistake was to put the WHERE clause onto that query directly, when what you want to do is use it as is in a sub-query and write the WHERE in the main query.
From that point, you have several possibilities to finish the query.

Here is a version using a scalar subquery:

SELECT some_fk, some_date_field AS some_date_field_start,
    (
        SELECT MIN(some_date_field)
        FROM foo
        WHERE some_fk = F.some_fk AND some_date_field > F.some_date_field AND some_field IN ('X','Y','Z')
    ) AS some_date_field_end
FROM (
    SELECT *,
    LAG(some_field) OVER (PARTITION BY some_fk ORDER BY some_date_field) Previous_some_field
    FROM foo
) F
WHERE some_field IN ('A','B','C')
AND COALESCE(previous_some_field,'Z') IN ('X','Y','Z')

Here is another version using a CROSS JOIN LATERAL:

SELECT some_fk, some_date_field AS some_date_field_start, some_date_field_end
FROM (
    SELECT *,
    LAG(some_field) OVER (PARTITION BY some_fk ORDER BY some_date_field) Previous_some_field
    FROM foo
) F1
CROSS JOIN LATERAL (
    SELECT MIN(some_date_field) AS some_date_field_end
    FROM foo
    WHERE some_fk = F1.some_fk AND some_date_field > F1.some_date_field AND some_field IN ('X','Y','Z')
) F2
WHERE some_field IN ('A','B','C')
AND COALESCE(previous_some_field,'Z') IN ('X','Y','Z')

- ChrisMaurer
- February 16, 2023 at 6:36 am
- 0 votes
0
Another approach is to create a grouping ID by creating a running sum of the closing events. Then in an outer SQL you can Group By and pick min() and max() dates.
```
Select some_fk,min(some_date) as some_date_field_start, max(some_date) as some_date_field _end
From (
    Select some_fk,some_date,
      Sum(Case When some_field in ('X','Y','Z') Then 1 Else 0 End)
        Over (Partition By some_fk Order By some_date
        Rows Between Unbounded Preceding And 1 Preceding)
      as some_grouping
    From foo
)
Group By some_fk,some_grouping
Order By some_fk,some_grouping
```
This seems a little simpler at least to me.
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Is there a way to collapse ordered rows by terminal values with postgres window clause – Postgresql

Answers