Database aggregation is usually much faster than returning all the data to the application and calculating there. But for this partitioned table, a simple one-column AVG() query runs in 15 seconds, while returning all the rows to the application runs an order of magnitude faster (~1sec.) Both queries have identical WHERE clauses.
Bulk query:
SELECT * FROM Sales
WHERE company = 191
AND sale_date BETWEEN '10-01-2023' AND '10-31-2023'
Aggregate query:
SELECT AVG(volume) FROM Sales
WHERE company = 191
AND sale_date BETWEEN '10-01-2023' AND '10-31-2023'
Sales Table
CREATE TABLE Sales (
sale_date Timestamp(0) WITH TIME ZONE NOT NULL,
company INTEGER NOT NULL,
volume SMALLINT NOT NULL
) PARTITION BY RANGE(sent_on);
CREATE INDEX ON Sales USING BTREE(company,sale_date);
The DB is PostgreSql (running on AWS-Aurora)
The EXPLAIN output (execution plan) for the aggregate query is below. It shows the query being used, and everything but the final aggregation having a very low cost:
Aggregate (cost=109512.77..109512.78 rows=1 width=32)
Bitmap Heap Scan on sales_202310 sales (cost=835.01..109440.13 rows=29054 width=30)
Recheck Cond: ((company = 191) AND (sale_date >= '2023-10-01 00:00:00-04'::timestamp with time zone) AND (sale_date <= '2023-10-31 00:00:00-04'::timestamp with time zone))
-> Bitmap Index Scan on sales_202310_company_sale_date_idx (cost=0.00..827.75 rows=29054 width=0)
Index Cond: ((company = 191) AND (sale_date >= '2023-10-01 00:00:00-04'::timestamp with time zone) AND (sale_date <= '2023-10-31 00:00:00-04'::timestamp with time zone))
The execution plan for the raw query is identical (even the costs), except it lacks the first line (the aggregate).
NOTE: Once I run the aggregate query once for a given company (and thus I assume have the data buffered) re-running it again is as fast as the bulk query. The difference is only on the initial run.
2
Answers
I think you are deceived, and you are only seeing caching effects. If you run your “fast” query for a company whose data are not cached yet, it will be just as slow (likely slower) than the “slow” query.
The three ways to make this query faster are:
Cache all data by stuffing enough RAM into the machine and load the table and its indexes into cache with
pg_prewarm
.Cluster the table on the index:
Attention: this will lock and rewrite the table.
Create an index that can be used for an index-only scan:
The difference in actual time is probably caching, as described by Laurenz.
The difference in cost estimate is presumably because you are misreading the plan. The first number in the pair is the cost to return the first row, while the second is the cost to return all rows. The second is generally the one to look at for a query run to completion. For the aggregate, those numbers are the same since it only returns one row. For the scan step, and presumably for the top level of the non-aggregate query, they differ by over 100 fold (835.01..109440.13)