Postgresql - Avoid full table scan under window function run with use of join condition

vbarashko
September 27, 2022
195 views
3 votes
5 Answers

Given a database table events with columns: event_id, correlation_id, username, create_timestamp. It contains more than 1M records.

There is a problem I am trying to solve: for each event of a particular user display its latest sibling event. Sibling events are the events which has same correlation_id. The query I use for that is the following:

SELECT 
  "events"."event_id" AS "event_id", 
  "latest"."event_id" AS "latest_event_id" 
FROM 
  events "events" 
  JOIN (
    SELECT 
      "latest"."correlation_id" AS "correlation_id", 
      "latest"."event_id" AS "event_id", 
      ROW_NUMBER () OVER (
        PARTITION BY "latest"."correlation_id" 
        ORDER BY 
          "latest"."create_timestamp" ASC
      ) AS "rn" 
    FROM 
      events "latest"
  ) "latest" ON (
    "latest"."correlation_id" = "events"."correlation_id" 
    AND "latest"."rn" = 1
  ) 
WHERE 
  "events"."username" = 'user1'

It gets correct list of results but causes performance problems which must be fixed. Here is an execution plan of the query:

Hash Right Join  (cost=13538.03..15522.72 rows=1612 width=64)
  Hash Cond: (("latest".correlation_id)::text = ("events".correlation_id)::text)
  ->  Subquery Scan on "latest"  (cost=12031.35..13981.87 rows=300 width=70)
        Filter: ("latest".rn = 1)
        ->  WindowAgg  (cost=12031.35..13231.67 rows=60016 width=86)
              ->  Sort  (cost=12031.35..12181.39 rows=60016 width=78)
                    Sort Key: "latest_1".correlation_id, "latest_1".create_timestamp
                    ->  Seq Scan on events "latest_1"  (cost=0.00..7268.16 rows=60016 width=78)
  ->  Hash  (cost=1486.53..1486.53 rows=1612 width=70)
        ->  Index Scan using events_username on events "events" (cost=0.41..1486.53 rows=1612 width=70)
              Index Cond: ((username)::text = 'user1'::text)

From the plan, I can conclude that the performance problem mainly caused by calculation of latest events for ALL events in the table which takes ~80% of cost. Also it performs the calculations even if there are no events for a user at all. Ideally, I would like the query to do these steps which seem more efficient for me:

find all events by a user
for each event from Step 1, find all siblings, sort them and get the 1st

To simplify the discussion, let’s consider all required indexes as already created for needed columns. It doesn’t seem to me that the problem can be solved purely by index creation.

Any ideas what can be done to improve the performance? Probably, there are options to rewrite the query or adjust a configuration of the table.

Note that this question is significantly obfuscated in terms of business meaning to clearly demonstrate the technical problem I face.

Answers

One option that may improve efficiency is to rewrite the query filtering "rn" = 1 beforehand to reduce resulting rows when joining tables:

WITH "latestCte"("correlation_id", "event_id") as (SELECT 
  "correlation_id", 
  "event_id", 
  ROW_NUMBER () OVER (
    PARTITION BY "correlation_id" 
    ORDER BY 
      "create_timestamp" ASC
  ) AS "rn" 
FROM 
  events)
SELECT 
  "events"."event_id" AS "event_id", 
  "latest"."event_id" AS "latest_event_id" 
FROM 
  events "events" 
  JOIN (
    SELECT "correlation_id", "event_id" FROM "latestCte" WHERE "rn" = 1
  ) "latest" ON (
    "latest"."correlation_id" = "events"."correlation_id" 
  ) 
WHERE 
  "events"."username" = 'user1'

Hope it helps, also I am curious to see the resulting execution plan of this query. Best regards.

- Hambone
- September 27, 2022 at 5:09 pm
- 0 votes
0
Without having access to the data, I’m really just throwing out ideas…
1. Instead of a subquery, it’s worth trying a materialized CTE
2. Rather than the row_number analytic, you can try a distinct on. Honestly, I don’t predict any gains. It’s basically the same thing at the database level
Sample of both:
```
with latest as materialized (
    SELECT distinct on ("correlation_id")
      "correlation_id", "event_id" 
    FROM events
    order by
      "correlation_id", "create_timestamp" desc
)
SELECT 
  e."event_id", 
  l."event_id" AS "latest_event_id" 
FROM 
  events "events" e
  join latest l ON
    l."correlation_id" = e."correlation_id" 
WHERE 
  e."username" = 'user1'
```
Additional suggestion — if you are doing this over and over, I’d consider creating a temp table or materialized view for "latest," including an index by coorelation_id rather than re-running the subquery (or CTE) every single time. This will be a one-time pain followed my repeated gain.

Yet one more suggestion — if at all possible, drop the double quotes from your object names. Maybe it’s just me, but I find them brutal. Unless you have spaces, reserved words or mandatory uppercase in your field names (please don’t do that), then these create more problems than they solve. I kept them in the query I listed above, but it pained me.

And this last comment goes back to knowing your data… since row_number and distinct on are relatively expensive operations, it may make sense to make your subquery/cte more selective by introducing the "user1" constraint. This is completely untested, but something like this:
```
SELECT distinct on (e1.correlation_id)
  e1.correlation_id, e1.event_id
FROM events e1
join events e2 on
  e1.correlation_id = e2.correlation_id and
  e2.username = 'user1'
order by
  e1.correlation_id, e1.create_timestamp desc
```
Login or Signup to reply.

- LaurenzAlbe
- September 27, 2022 at 5:10 pm
- 0 votes
0
The window function has to scan the whole table. It has no idea that really you are only interested in the first value. A lateral join could perform better and is more readable anyway:
```
SELECT 
  e.event_id, 
  latest.latest_event_id
FROM 
  events AS e
  CROSS JOIN LATERAL
     (SELECT
        l.event_id AS latest_event_id
      FROM
        events AS l
      WHERE
        l.correlation_id = e.correlation_id 
      ORDER BY l.create_timestamp
      FETCH FIRST 1 ROWS ONLY
     ) AS latest
WHERE e.username = 'user1';
```
The perfect index to support that would be
```
CREATE INDEX ON event (correlation_id, create_timestamp);
```
Login or Signup to reply.

- jjanes
- September 27, 2022 at 5:18 pm
- 0 votes
0
All those needless double quotes are making my eyes bleed.

This should very fast with a lateral join, provided the number of returned rows is rather low, i.e. ‘user1’ is rather specific.
```
explain analyze SELECT 
  events.event_id AS event_id, 
  latest.event_id AS latest_event_id 
FROM 
  events "events" 
  cross JOIN lateral (
    SELECT 
      latest.event_id AS event_id 
      FROM events latest
      WHERE latest.correlation_id=events.correlation_id 
      ORDER by create_timestamp ASC limit 1
  ) latest 
WHERE 
  events.username = 'user1';
```
You will want an index on username, and one on (correlation_id, create_timestamp)

If the number of rows returned is large, then your current query, which precomputes in bulk, is probably better. But would be faster if you used DISTINCT ON rather than the window function to pull out just the latest for each correlation_id. Unfortunately the planner does not understand these queries to be equivalent, and so will not interconvert between them based on what it thinks will be faster.
Login or Signup to reply.

- deroby
- September 29, 2022 at 11:10 am
- 0 votes
0
Although I like the LATERAL JOIN approach suggested by others, when it comes to fetching just 1 field I’m 50/50 about using that and using a subquery as below.(If you need to fetch multiple fields using the same logic than by all means LATERAL is the way to go!)

I wonder if either would perform better, presumably they are executed in a very similar way by the SQL engine.
```
SELECT e.event_id, 
       (SELECT l.event_id
          FROM events AS l
         WHERE l.correlation_id = e.correlation_id 
         ORDER BY l.create_timestamp ASC -- shouldn't this be DESC?
         FETCH FIRST 1 ROWS ONLY) as latest_event_id
FROM events AS e
WHERE e.username = 'user1';
```
Note: You’re currently asking for the OLDEST correlated record. In your post you say you’re looking for the "latest sibling event". "Latest" IMHO implies the most recent one, so it would have the biggest create_timestamp, meaning you need to ORDER BY that field from high to low and then take the first one.

Edit: identical as suggested above, for this approach you also want the index on correlation_id and create_timestamp
```
CREATE INDEX ON event (correlation_id, create_timestamp);
```
You might even want to include the event_id to avoid a bookmark lookup although these pages are likely to be in cache anyway so not sure if it will really help all that much.
```
CREATE INDEX ON event (correlation_id, create_timestamp, event_id);
```
PS: same is true about adding correlation_id to your events_username index… but that’s all quite geared towards this (probably simplified) query and do keep in mind that more (and bigger) indexes will bring some overhead elsewhere even when they might bring big benefits elsewhere… it’s always a compromise.
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Postgresql – Avoid full table scan under window function run with use of join condition

Answers