I have big Postgres database(around 75 GB) and queries are very slow. Is there any way to make them faster?
About database:
List of relations
Schema | Name | Type | Owner | Persistence | Access method | Size | Description
--------+-------------------+----------+----------+-------------+---------------+------------+-------------
public | fingerprints | table | postgres | permanent | heap | 35 GB |
public | songs | table | postgres | permanent | heap | 26 MB |
public | songs_song_id_seq | sequence | postgres | permanent | | 8192 bytes |
d+ fingerprints
Table "public.fingerprints"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
---------------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
hash | bytea | | not null | | extended | | |
song_id | integer | | not null | | plain | | |
offset | integer | | not null | | plain | | |
date_created | timestamp without time zone | | not null | now() | plain | | |
date_modified | timestamp without time zone | | not null | now() | plain | | |
Indexes:
"ix_fingerprints_hash" hash (hash)
"uq_fingerprints" UNIQUE CONSTRAINT, btree (song_id, "offset", hash)
Foreign-key constraints:
"fk_fingerprints_song_id" FOREIGN KEY (song_id) REFERENCES songs(song_id) ON DELETE CASCADE
Access method: heap
d+ songs
Table "public.songs"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
---------------+-----------------------------+-----------+----------+----------------------------------------+----------+-------------+--------------+-------------
song_id | integer | | not null | nextval('songs_song_id_seq'::regclass) | plain | | |
song_name | character varying(250) | | not null | | extended | | |
fingerprinted | smallint | | | 0 | plain | | |
file_sha1 | bytea | | | | extended | | |
total_hashes | integer | | not null | 0 | plain | | |
date_created | timestamp without time zone | | not null | now() | plain | | |
date_modified | timestamp without time zone | | not null | now() | plain | | |
Indexes:
"pk_songs_song_id" PRIMARY KEY, btree (song_id)
Referenced by:
TABLE "fingerprints" CONSTRAINT "fk_fingerprints_song_id" FOREIGN KEY (song_id) REFERENCES songs(song_id) ON DELETE CASCADE
Access method: heap
No need to write to database, only read. All queries are very simple:
SELECT song_id
WHERE hash in fingerpints = X
EXPLAIN(analyze, buffers, format text) SELECT "song_id", "offset" FROM "fingerprints" WHERE "hash" = decode('eeafdd7ce9130f9697','hex');
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using ix_fingerprints_hash on fingerprints (cost=0.00..288.28 rows=256 width=8) (actual time=0.553..234.257 rows=871 loops=1)
Index Cond: (hash = 'xeeafdd7ce9130f9697'::bytea)
Buffers: shared hit=118 read=749
Planning Time: 0.225 ms
Execution Time: 234.463 ms
(5 rows)
234 ms looks fine where it is one query. But in reality there 3000 query per time, that takes about 600 seconds. It is audio recognition application, so algoritm works like that.
About indexes:
CREATE INDEX "ix_fingerprints_hash" ON "fingerprints" USING hash ("hash");
For pooler I use Odyssey.
Little bit of info from config:
shared_buffers = 4GB
huge_pages = try
work_mem = 582kB
maintenance_work_mem = 2GB
effective_io_concurrency = 200
max_worker_processes = 24
max_parallel_workers_per_gather = 12
max_parallel_maintenance_workers = 4
max_parallel_workers = 24
wal_buffers = 16MB
checkpoint_completion_target = 0.9
max_wal_size = 16GB
min_wal_size = 4GB
random_page_cost = 1.1
effective_cache_size = 12GB
Info about hardware:
- Xeon 12 core (24 threads)
- RAM DDR4 16 GB ECC
- NVME disk
Will the database be accelerated by purchase more RAM to handle all DB inside (128 GB in example)? And what parameters should I change to say to Postgres to store db in ram?
I read about several topics about pg_tune, etc. but experiments don’t show any good results.
2
Answers
So the DB is read-only.
And in comments:
So indexes have been built up incrementally.
Indexes
UNIQUE CONSTRAINT
on(song_id, "offset", hash)
I would replace that with:
This enforces the same constraint, but the leading
hash
column in the underlying B-tree index now supports the filter onhash
in your displayed query. And the fact that all needed columns are included in the index, further allows much faster index-only scans. The (smaller) index should also be more easily cached than the (bigger) table (plus index).See:
Also rewrites the index in pristine condition, and with
FILLFACTOR
100 for the read-only DB. (Instead of the default 90 for a B-tree index.)Hash index on
(hash)
andCLUSTER
The name of the column "hash" has nothing to do with the name of the index type, which also happens to be "hash". (The column should probably not be named "hash" to begin with.)
If (and only if) you also have other queries centered around one of few
hash
values, that cannot use index-only scans (and you actually see faster queries than without) keep the hash index, additionally. And optimize it. (Else drop it!)An incrementally grown index may end up with bloat or unbalanced overflow pages in case of a hash index.
REINDEX
should take care of that. While being at it, increaseFILLFACTER
to 100 (from the default 75 for a hash index) for your read-only (!) DB. You canREINDEX
to make the change effective.Or you can
CLUSTER
(like jjanes already suggested) on the rearranged B-tree index from above:Rewrites the table and all indexes; rows are physically sorted according to the given index, so "clustered" around the leading column(s). Effects are permanent for your read-only DB. But index-only scans do not benefit from this.
When done optimizing, run once:
work_mem
The tiny setting for
work_mem
stands out:Even the (very conservative!) default is 4MB.
But after reading your question again, it would seem you only have tiny queries. So maybe that’s ok after all.
Else, with 16GB RAM you can typically afford a 100 times as much. Depends on your work load of course.
work_mem
(like 4MB?)Large amounts of temporary files written in your database over time, and mentions of "disk" in the output of
EXPLAIN ANALYZE
would indicate the need for morework_mem
.Additional questions
More RAM almost always helps until the whole DB can be cached in RAM and all processes can afford all the
work_mem
they desire.Everything that’s read from the database is cached automatically in system cache and Postgres cache, up to the limit of available RAM. (Setting
work_mem
too high competes for that same resource.)Increasing the RAM so that everything can stay in cache (perhaps after using pg_prewarm to get it into cache in the first place) would certainly work. But it is expensive and shouldn’t be necessary.
Having a hash index on something which is already a hashed value is probably not very helpful. Have you tried just a default (btree) index instead?
If you CLUSTER the table on the index over the column named "hash" (which you can only do if it is a btree index) then rows with the same hash code should mostly share the same table page, which would greatly cut down on the number of different buffer reads needed to fetch them all.
If you could get it do a bitmap heap scan instead of an index scan, then it should be able to have a large number of read requests outstanding at a time, due to effective_io_concurrency. But the planner does not account for effective_io_concurrency when doing planning, which means it won’t choose a bitmap heap scan specifically to get it that benefit. Normally an index read finding hundreds of rows on different pages would automatically choose a bitmap heap scan method, but in your case it is probably the low setting of random_page_cost which is inhibiting it from doing so. The low setting of random_page_cost is probably reasonable in itself, but it does have this unfortunate side effect. A problem with this strategy is that it doesn’t reduce the overall amount of IO needed, it just allows them overlap and so make better use of multiple IO channels. But if many sessions are running many instances of this query at the same time, they will start filling up those channels and so start competing with each other. So the CLUSTER method is probably superior as it gets the same answer with less IO. If you want to play around with bitmap scans, you could temporarily increase random_page_cost or temporarily set enable_indexscan to off.