skip to Main Content

Recently I’ve migrated primary keys from integer to bigint and found an article where the author manually updates the table statistics after updating PK data type:

-- reanalyze the table in 3 steps
SET default_statistics_target TO 1;
ANALYZE [table];
SET default_statistics_target TO 10;
ANALYZE [table];
SET default_statistics_target TO DEFAULT;
ANALYZE [table];

As far as I understand analyzer automatically runs in the background to keep statistics up to date. In Postgres docs (Notes section) I’ve found that the analyzer can be run manually after some major changes.

So the questions are:

  • Does it make sense to manually run the analyzer if autovacuum is enabled? And if yes, in which situations?
  • What are best practices with it (e.g. that default_statistics_target switching above)?

2

Answers


  1. Autoanalyze is triggered by data modifications: it normally runs when 10% of your table has changed (with INSERT, UPDATE or DELETE). If you rewrite a table or create an expression index, that does not trigger autoanalyze. In these cases, it is a good idea to run a manual ANALYZE.

    Don’t bother with this dance around default_statistics_target. A simple, single ANALYZE tab; will do the trick.

    Login or Signup to reply.
  2. Temporary tables cannot be processed by autovacuum. If you want them to have stats, you need to do it manually.

    Creating an expressional index creates the need for new stats, but doesn’t do anything to schedule an auto-ANALYZE to happen. So if you make one of those, you should do an ANALYZE manually. Table rewrites like the type change you linked to are the similar, they destroy the old stats for that column, but don’t schedule an auto-analyze to collect new ones. Eventually the table would probably be analyzed again just due to "natural" turnover, but that could be a very long time. (We really should do something about that, so they happen automatically)

    If some part of the table is both updated and queried much heavily than others, the cold bulk of the table can dilute out the activity counters driven by the hot part, meaning the stats for the rapidly changing part can grow very out of date before auto-analyze kicks in.

    Even if the activity counters for a bulk operation do drive an auto-analyze, there is no telling how long it might take to finish. This table might get stuck behind lots of other tables already being vacuumed or waiting to be. If you just analyze it yourself after a major change, you know it will get started promptly, and you will know when it has finished (when your prompt comes back) without needing to launch an investigation into the matter.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search