We encountered a situation where CPU usage hit 100% due to a frequently running query. The issue resolved after performing a manual ANALYZE on the table involved in the query. However, I noticed the following:
Auto-analyze ran at 11:46 UTC (as confirmed by `last_autoanalyze` in `pg_stat_user_tables`) and had run prior to that as well.
There were no DML operations on the table between 11:46 UTC and the manual ANALYZE at 12:02 UTC.
Despite the recent auto-analyze, CPU usage only dropped after the manual ANALYZE.
If auto-vacuum triggers auto-analyze, and it successfully ran at 11:46 UTC, why didn’t it resolve the high CPU usage? What might explain the difference in behavior between the auto-analyze and the manual ANALYZE in this scenario?
Additional Context:
The query in question is frequently executed.
The database version is 14.
Relevant settings: default settings.
Any insights into why manual ANALYZE worked while auto-analyze didn’t would be greatly appreciated.
We expected autovacuum's auto-analyze to keep the statistics for the frequently executed query up to date, but CPU usage only dropped after the manual ANALYZE.
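For anyone checking the same thing: the relevant timestamps and the pending-modification count can be read from `pg_stat_user_tables` with a query along these lines (`my_table` is a placeholder for the actual table name):

```sql
-- Last (auto)vacuum / (auto)analyze times and rows modified since
-- the last ANALYZE ('my_table' is a placeholder table name)
SELECT relname,
       last_autovacuum,
       last_autoanalyze,
       last_analyze,
       n_mod_since_analyze
FROM   pg_stat_user_tables
WHERE  relname = 'my_table';
```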
2 Answers
Commands launched by autovacuum are throttled by default, so that interference with user commands does not get out of hand (see "Cost-based Vacuum Delay" in the manual). A manual `ANALYZE` (or `VACUUM`) runs at full speed and so completes faster than autovacuum with default settings, notably for big tables.

Plus, autovacuum only actually launches `ANALYZE` after a certain number of tuples have been written (also customizable), which is governed by `autovacuum_analyze_threshold` (default: 50) and `autovacuum_analyze_scale_factor` (default: 0.1).

Still, unless your table is big, `ANALYZE` is not that expensive and should finish in a timely manner. If your evidence is just the one case, don't draw premature conclusions. Column statistics are based on random samples, so the resulting estimates fluctuate along a normal distribution. While `ANALYZE` typically improves statistics, there is no strict guarantee.

If there were really no data modifications, there are few possible explanations, since the effects of an automatic and a manual `ANALYZE` are identical. Possible explanations are:
- The row sample that `ANALYZE` takes is randomized. It is possible that the good plan and the bad plan have almost the same cost, and the random sample can tip the scales either way. You can figure out whether that is the case by manually running `ANALYZE` several times: if performance is sometimes good after `ANALYZE` and sometimes bad, you may have found the explanation.
- Perhaps you didn't perform a DML operation, but a DDL operation like `CREATE INDEX` on an expression or `CREATE STATISTICS`. That doesn't trigger autoanalyze, but the new statistics gathered by a manual `ANALYZE` can make a difference.
- Perhaps the table is partitioned. Autoanalyze never processes partitioned tables (only their partitions), so only a manual `ANALYZE` collects statistics for the partitioned table itself, and that added information can make a difference to the planner.
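A sketch of how the first explanation could be tested; `my_table` and the `WHERE` clause are hypothetical stand-ins for the real table and the real frequent query:

```sql
-- Repeat this pair several times. If EXPLAIN's plan or cost estimates
-- flip between runs, the randomized sample taken by ANALYZE is
-- tipping the planner between two near-equal plans.
ANALYZE my_table;                -- take a fresh random sample
EXPLAIN SELECT * FROM my_table
WHERE  some_col = 42;            -- stand-in for the frequent query
```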