Same data and query. One finishes in seconds; the other times out - Amazon Web Sevices

tehc0w
April 3, 2023
209 views
0 votes
2 Answers

I’m using AWS RDS Postgres v13.7

I have a table, foo, of ~1m rows. I run a query that’s a select distinct on 2 columns in a CTE, a self join after, and finally a select count(*).

I clone the using create table bar as (select * from foo). I run the same query on this table.

The query using foo hangs and never finishes. The query using bar finishes in 10 seconds. I’m running/starting them side by side and tried running one before the other so it’s not a resource or order of query issue; foo never finishes, bar always finishes in a few seconds. There is a period of time in the afternoon when the foo query does also finish quickly so there may be a timing thing.

Literally the same data and query with different outcomes.

I’ve done a vacuum. Explain on the 2 tables and queries are identical.

When I try to decompose the CTE, the select distinct on foo or bar perform the same, as expected. It’s the self join on foo that hangs.

I’ve already read the post "Identical tables, identical query, completely different execution times" and Exactly same query and plan but different duration and total reads. I Know parameter sniffing but and I don’t believe they apply to my situation

My expectation is the foo query should run as fast as the bar query or they behave the same way

Answers

- draganstankovic
- April 20, 2023 at 2:03 am
- 0 votes
0
This looks like an issue with skewed stats for foo.
A query execution plan devised by Postgres relies on query semantics, indices available, server configuration, etc., BUT it also relies on table stats that postgres has gathered about your data (see: https://www.postgresql.org/docs/current/view-pg-stats.html). Ultimately, a query plan is devised based on heuristics.

I am guessing for your case it could be that foo might have been present for longer period of time and then its collected stats are different than the bar ones, and bar stats cause query planner to devise a better query plan which does not timeout.

One thing you can do is to try to create something like bar2 and confirm that one also works and see if you can compare query execution plans for foo, bar, and bar2. Likely foo would be different than either bar or bar2. Also, bar and bar2 should roughly be the same.

I hope this helps a bit.

EDIT: as pointed out by another answer – if the above proves to be true – you can do the: VACUUM ANALYZE foo; to refresh the stats.

Login or Signup to reply.

- PaulMaxwell
- April 20, 2023 at 2:46 am
- 0 votes
0
In PostgreSQL, VACUUM reclaims storage occupied by dead tuples and ANALYZE updates the statistics, or use VACUUM ANALYZE to reclaims storage and update table statistics.

e.g:
```
VACUUM ANALYZE foo;
-- now run your query
```
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Same data and query. One finishes in seconds; the other times out – Amazon Web Sevices

Answers