Postgresql - Postgres Select Row From Group With Opposite Sort Order

arik
May 8, 2024
76 views
2 votes
3 Answers

I have a Postgres table foobar with the following schema:

id: integer
group: integer
foo: integer
bar: integer
timestamp: integer

I keep track of updates to various groups, and those updates mutate the foo and bar properties. Every time I receive an update, I store the timestamp.

Here’s an example of the values I could have in the database:

+------+---------+-------+-------+-------------+
| "id" | "group" | "foo" | "bar" | "timestamp" |
+------+---------+-------+-------+-------------+
| 1    | 1       | 10    | 20    | 1           |
| 2    | 1       | 11    | 19    | 2           |
| 3    | 1       | 10    | 20    | 3           |
| 4    | 1       | 10    | 20    | 4           |
+------+---------+-------+-------+-------------+
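
For anyone who wants to reproduce this, the table and sample rows can be set up roughly as follows. This is only a sketch: the column types come from the schema above, and leaving out keys and constraints is an assumption made for brevity.

```sql
-- Column names and types follow the schema listed in the question.
-- "group" is a reserved word in Postgres, so it has to be quoted.
-- No primary key or constraints are declared (assumption, for brevity).
CREATE TABLE foobar (
    id          integer,
    "group"     integer,
    foo         integer,
    bar         integer,
    "timestamp" integer
);

-- The four sample rows from the table above.
INSERT INTO foobar (id, "group", foo, bar, "timestamp") VALUES
    (1, 1, 10, 20, 1),
    (2, 1, 11, 19, 2),
    (3, 1, 10, 20, 3),
    (4, 1, 10, 20, 4);
```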

Oftentimes, the updates I receive are identical. A particularly critical piece of information I’m trying to extract is when I first received the combination of values that is currently in effect: not the first time that combination ever appeared, but the first update after which there haven’t been any changes. In the example above, that is the row with timestamp 3.

A naïve approach would be the following query:

SELECT DISTINCT ON ("group", foo, bar) *
FROM foobar
ORDER BY "group", foo, bar, timestamp DESC;

However, that query would return the last row, which has the latest timestamp. If I switch timestamp to ASC, I would get the very first row, because I have seen the exact foo/bar value combination prior to the update at timestamp 2.

The intuitive thing would have been to simply move the timestamp DESC sort key ahead of foo, but Postgres does not allow that, because the DISTINCT ON expressions have to match the leading ORDER BY expressions. Something like MySQL’s more lenient HAVING could also have come in handy, but as far as I can tell Postgres doesn’t offer an equivalent.

An incredibly inefficient approach would be to iterate through each group programmatically, get the latest row, then fetch all rows in descending timestamp order and stop as soon as I observe a change. It seems wiser to let the database do this sort of operation.

I am quite certain that I’m missing something obvious, but would greatly appreciate any help. Thanks!

Answers

- Stefanovsm
- May 8, 2024 at 12:56 pm
- 0 votes
You may resort to the old-school way of emulating a ‘distinct on’ selection, using the row_number window function:
```
select distinct on ("group", foo, bar) * from 
(
 select *, 
   row_number() over (partition by "group", foo, bar order by "timestamp") rnk
 from foobar
) t
where rnk > 1 -- ignore the initial set of values
order by "group", foo, bar, rnk; -- and pick the earliest record
```

A way to solve your issue would be this:

select fb1."group", fb1."foo", fb1."bar"
from foobar fb1
left join foobar fb2
on fb1."group" = fb2."group" and
   fb2."timestamp" > fb1."timestamp" and
   (fb2."foo" <> fb1."foo" or fb2."bar" <> fb1."bar")
left join foobar fb3
on fb1."group" = fb3."group" and
   fb3."timestamp" < fb1."timestamp" and
   fb3."foo" = fb1."foo" and
   fb3."bar" = fb1."bar"
left join foobar fb4
on fb4."group" = fb1."group" and
   fb4."timestamp" < fb1."timestamp" and fb4."timestamp" > fb3."timestamp" and
   (fb4."foo" <> fb1."foo" or fb4."bar" <> fb1."bar")
where fb2."group" is null and (fb3."group" is null or fb4."group" is not null)
group by fb1."group", fb1."foo", fb1."bar"

Explanation: We are searching for the fb1 records for which there is no newer fb2 record with a different foo and bar combination, and for which there is either no older fb3 record with the same combination or, if such an fb3 exists, an in-between fb4 record with a different combination.

- Zegarek
- May 8, 2024 at 1:49 pm
- 0 votes
That’s a gaps-and-islands problem.
You can compare each row’s (foo,bar) to the previous row’s using lag((foo,bar)). The window definition restricts the comparison to rows from the same group, in ascending timestamp order. Demo:
```
select id,"group",foo,bar,"timestamp"
from (select *,coalesce((foo,bar)<>lag((foo,bar))over w1,true) is_diff_from_prev
      from foobar
      window w1 as (partition by "group" order by "timestamp"))_
where is_diff_from_prev;
```
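+----+-------+-----+-----+-----------+
| id | group | foo | bar | timestamp |
+----+-------+-----+-----+-----------+
| 1  | 1     | 10  | 20  | 1         |
| 2  | 1     | 11  | 19  | 2         |
| 3  | 1     | 10  | 20  | 3         |
+----+-------+-----+-----+-----------+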
Row 4 is missing because its (foo,bar) wasn’t different from row 3’s. The subquery is required because window functions aren’t allowed directly in a where clause.
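
If only the start of the current run is needed for each group (which seems to be what the question asks for), the same is_diff_from_prev flag can be combined with distinct on. The following is a sketch rather than a tested solution: it assumes timestamps are unique within a group, and the subquery alias runs is just a name picked for illustration.

```sql
-- Sketch: keep only the most recent "change" row per group, i.e. the first
-- row of the current run of identical (foo, bar) values.
-- Assumes "timestamp" is unique within each group.
select distinct on ("group") id, "group", foo, bar, "timestamp"
from (select *,
             -- true when (foo, bar) differs from the previous row in the group,
             -- or when there is no previous row
             coalesce((foo, bar) <> lag((foo, bar)) over w1, true) as is_diff_from_prev
      from foobar
      window w1 as (partition by "group" order by "timestamp")) as runs
where is_diff_from_prev
-- distinct on keeps the first row per group, so sort newest change first
order by "group", "timestamp" desc;
```

With the sample data this should leave only the row with id 3 (timestamp 3). If foo or bar can be null, the coalesce treats an undetermined comparison as a change, so is distinct from may be a safer comparison there.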