I’ve built many queries on my PostgreSQL database, but some of them take a very long time to load, and I’m having trouble understanding why. Especially the one below, which can return a lot of information but uses very simple tables, so I don’t understand the execution time. Could you help me understand why it’s taking so long?
Thanks in advance!
query :
explain(analyze, verbose, buffers) SELECT prj_projet.prj_spkprojet,
prj_projet.prj_cintituleprojet,
prj_projet.prj_spkprojet AS "Id",
prj_projet.prj_cnumeroprojet,
prj_projet.prj_bprojetmodele,
prj_projet.prj_sfkcategorie,
intervenants.prj_intervenant_projet,
'[]'::jsonb AS prj_utilisateur,
prj_projet.prj_sfktype,
prj_projet.prj_sfketat,
prj_projet.prj_created_date AS prj_xcreated_date,
prj_projet.prj_last_modified_date AS prj_xlast_modified_date,
concat(utl1.utl_cnom, ' ', utl1.utl_cprenom) AS prj_created_by,
concat(utl2.utl_cnom, ' ', utl2.utl_cprenom) AS prj_last_modified_by,
grouprs.tmp_spkutilisateur,
grouprs.tmp_spkutilisateur IS NULL AS tmp_butilisateur,
doss.pjd_sfkdossier AS prj_sfkdossier,
(SELECT DISTINCT string_agg(encode(pg_largeobject_1.data, 'escape'::text), ''::text) AS string_agg
FROM baseteamberv20prj.prj_projet prj1
JOIN pg_largeobject pg_largeobject_1 ON prj1.prj_cphoto = pg_largeobject_1.loid
WHERE prj1.prj_spkprojet = prj_projet.prj_spkprojet) AS prj_cphoto,
prj_projet.prj_sdesactive,
prj_projet.prj_sfkinstance AS "Instance"
FROM baseteamberv20prj.prj_projet
LEFT JOIN (SELECT COALESCE(jsonb_agg(DISTINCT jsonb_build_object('itp_sfkintervenant', pi.int_sfkcontact,
'itp_sfkintervenant_contrat',
pi.int_spkintervenant, 'itp_sfktyperelation',
prj_intervenant_projet.itp_sfktyperelation,
'itp_sfktypeintervenant',
prj_intervenant_projet.itp_sfktypeintervenant))
FILTER (WHERE prj_intervenant_projet.itp_spkintervenantprojet IS NOT NULL),
'[]'::jsonb) AS prj_intervenant_projet,
prj_intervenant_projet.itp_sfkprojet
FROM baseteamberv20prj.prj_intervenant_projet
JOIN baseteamberv20prj.prj_intervenant pi
ON prj_intervenant_projet.itp_sfkintervenant = pi.int_spkintervenant
GROUP BY prj_intervenant_projet.itp_sfkprojet) intervenants
ON intervenants.itp_sfkprojet = prj_projet.prj_spkprojet
LEFT JOIN (SELECT COALESCE(jsonb_agg(ppud.pru_sfkutilisateur)
FILTER (WHERE ppud.pru_spkprojetutilisateur IS NOT NULL),
'[]'::jsonb) AS prj_utilisateur,
pj.prj_spkprojet
FROM baseteamberv20prj.prj_projet pj
LEFT JOIN baseteamberv20prj.prj_projet_utilisateur ppud
ON pj.prj_spkprojet = ppud.pru_sfkprojet AND ppud.pru_sfktyperelation 1
GROUP BY pj.prj_spkprojet) users ON users.prj_spkprojet = prj_projet.prj_spkprojet
LEFT JOIN (SELECT mat_vutilisateur_groupe_projet.tmp_spkutilisateur,
mat_vutilisateur_groupe_projet.prj_spkprojet
FROM baseteamberv20prj.mat_vutilisateur_groupe_projet) grouprs
ON grouprs.prj_spkprojet = prj_projet.prj_spkprojet
LEFT JOIN (SELECT prj_projet_dossier.pjd_sfkprojet,
prj_projet_dossier.pjd_sfkdossier
FROM baseteamberv20prj.prj_projet_dossier
LIMIT 1) doss ON doss.pjd_sfkprojet = prj_projet.prj_spkprojet
LEFT JOIN baseteamberv20utl.utl_utilisateur utl1
ON prj_projet.prj_created_by = utl1.utl_spkutilisateur AND prj_projet.prj_sdesactive = 0
LEFT JOIN baseteamberv20utl.utl_utilisateur utl2
ON prj_projet.prj_last_modified_by = utl2.utl_spkutilisateur AND prj_projet.prj_sdesactive = 0;
Plan :
Hash Right Join (cost=16173.72..5766871.09 rows=2224538 width=293) (actual time=1367.509..14853.652 rows=2224540 loops=1)
Hash Cond: (mat_vutilisateur_groupe_projet.prj_spkprojet = prj_projet.prj_spkprojet)
-> Seq Scan on mat_vutilisateur_groupe_projet (cost=0.00..36415.38 rows=2224538 width=16) (actual time=0.011..211.351 rows=2224538 loops=1)
-> Hash (cost=15980.52..15980.52 rows=15456 width=176) (actual time=1367.413..1367.424 rows=15456 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 14575kB
-> Hash Left Join (cost=13800.90..15980.52 rows=15456 width=176) (actual time=930.818..1349.884 rows=15456 loops=1)
Hash Cond: (prj_projet.prj_last_modified_by = utl2.utl_spkutilisateur)
Join Filter: (prj_projet.prj_sdesactive = 0)
-> Hash Left Join (cost=13791.25..15930.29 rows=15456 width=170) (actual time=930.725..1347.195 rows=15456 loops=1)
Hash Cond: (prj_projet.prj_created_by = utl1.utl_spkutilisateur)
Join Filter: (prj_projet.prj_sdesactive = 0)
-> Merge Left Join (cost=13781.59..15879.04 rows=15456 width=164) (actual time=930.604..1343.656 rows=15456 loops=1)
Merge Cond: (prj_projet.prj_spkprojet = doss.pjd_sfkprojet)
-> Merge Left Join (cost=13781.54..15840.34 rows=15456 width=156) (actual time=930.566..1341.224 rows=15456 loops=1)
Merge Cond: (prj_projet.prj_spkprojet = prj_intervenant_projet.itp_sfkprojet)
-> Index Scan using prj_projet_pkey on prj_projet (cost=0.29..726.42 rows=15456 width=124) (actual time=807.998..814.038 rows=15456 loops=1)
Filter: (prj_sdesactive = 0)
-> GroupAggregate (cost=13781.26..14734.72 rows=15136 width=40) (actual time=122.530..521.385 rows=15455 loops=1)
Group Key: prj_intervenant_projet.itp_sfkprojet
-> Sort (cost=13781.26..13972.32 rows=76426 width=48) (actual time=122.394..130.368 rows=76426 loops=1)
Sort Key: prj_intervenant_projet.itp_sfkprojet
Sort Method: quicksort Memory: 9269kB
-> Hash Join (cost=5562.54..7582.43 rows=76426 width=48) (actual time=57.770..93.160 rows=76426 loops=1)
Hash Cond: (prj_intervenant_projet.itp_sfkintervenant = pi.int_spkintervenant)
-> Seq Scan on prj_intervenant_projet (cost=0.00..1819.26 rows=76426 width=40) (actual time=0.016..6.338 rows=76426 loops=1)
-> Hash (cost=3727.24..3727.24 rows=146824 width=16) (actual time=57.156..57.157 rows=146824 loops=1)
Buckets: 262144 Batches: 1 Memory Usage: 8594kB
-> Seq Scan on prj_intervenant pi (cost=0.00..3727.24 rows=146824 width=16) (actual time=0.012..20.181 rows=146824 loops=1)
-> Sort (cost=0.05..0.05 rows=1 width=16) (actual time=0.027..0.029 rows=1 loops=1)
Sort Key: doss.pjd_sfkprojet
Sort Method: quicksort Memory: 25kB
-> Subquery Scan on doss (cost=0.00..0.04 rows=1 width=16) (actual time=0.023..0.024 rows=1 loops=1)
-> Limit (cost=0.00..0.03 rows=1 width=16) (actual time=0.019..0.020 rows=1 loops=1)
-> Seq Scan on prj_projet_dossier (cost=0.00..1.59 rows=59 width=16) (actual time=0.015..0.015 rows=1 loops=1)
-> Hash (cost=7.07..7.07 rows=207 width=22) (actual time=0.102..0.103 rows=207 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 20kB
-> Seq Scan on utl_utilisateur utl1 (cost=0.00..7.07 rows=207 width=22) (actual time=0.010..0.055 rows=207 loops=1)
-> Hash (cost=7.07..7.07 rows=207 width=22) (actual time=0.078..0.078 rows=207 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 20kB
-> Seq Scan on utl_utilisateur utl2 (cost=0.00..7.07 rows=207 width=22) (actual time=0.007..0.033 rows=207 loops=1)
SubPlan 1
-> Unique (cost=2.54..2.55 rows=1 width=32) (actual time=0.004..0.004 rows=1 loops=2224540)
-> Sort (cost=2.54..2.55 rows=1 width=32) (actual time=0.004..0.004 rows=1 loops=2224540)
Sort Key: (string_agg(encode(pg_largeobject_1.data, 'escape'::text), ''::text))
Sort Method: quicksort Memory: 25kB
-> Aggregate (cost=2.52..2.53 rows=1 width=32) (actual time=0.003..0.003 rows=1 loops=2224540)
-> Nested Loop (cost=0.29..2.52 rows=1 width=32) (actual time=0.003..0.003 rows=0 loops=2224540)
Join Filter: (prj1.prj_cphoto = pg_largeobject_1.loid)
-> Index Scan using prj_projet_pkey on prj_projet prj1 (cost=0.29..2.50 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=2224540)
Index Cond: (prj_spkprojet = prj_projet.prj_spkprojet)
-> Seq Scan on pg_largeobject pg_largeobject_1 (cost=0.00..0.00 rows=1 width=36) (actual time=0.000..0.000 rows=0 loops=2224540)
Planning Time: 1.221 ms
JIT:
Functions: 72
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 4.198 ms, Inlining 23.852 ms, Optimization 489.658 ms, Emission 294.622 ms, Total 812.330 ms
Execution Time: 14968.105 ms
and even more shocking is that query really simple but took too long too..
Query :
explain(analyze, verbose, buffers) SELECT prj_projet.prj_spkprojet,
utl_utilisateur_groupe.ugr_sfkutilisateur AS tmp_spkutilisateur,
utl_utilisateur_groupe.ugr_sfkutilisateur IS NULL AS tmp_butilisateur
FROM baseteamberv20prj.prj_projet
LEFT JOIN baseteamberv20prj.prj_projet_group ON prj_projet.prj_spkprojet = prj_projet_group.rpg_sfkprojet
LEFT JOIN baseteamberv20utl.utl_utilisateur_groupe
ON utl_utilisateur_groupe.ugr_sfkgroupe = prj_projet_group.rpg_sfkgroupe AND
(utl_utilisateur_groupe.ugr_sfkdroits = 56 OR utl_utilisateur_groupe.ugr_sfkdroits = 57)
JOIN baseteamberv20utl.utl_utilisateur
ON utl_utilisateur_groupe.ugr_sfkutilisateur = utl_utilisateur.utl_spkutilisateur AND
utl_utilisateur.utl_sdesactive = 0
GROUP BY prj_projet.prj_spkprojet, utl_utilisateur_groupe.ugr_sfkutilisateur;
Plan:
HashAggregate (cost=148677.79..180671.71 rows=3199392 width=17) (actual time=4746.171..5335.106 rows=2224538 loops=1)
Output: prj_projet.prj_spkprojet, utl_utilisateur_groupe.ugr_sfkutilisateur, (utl_utilisateur_groupe.ugr_sfkutilisateur IS NULL)
Group Key: prj_projet.prj_spkprojet, utl_utilisateur_groupe.ugr_sfkutilisateur
Batches: 1 Memory Usage: 204817kB
Buffers: shared hit=5485 read=308
-> Hash Join (cost=475.69..111410.49 rows=7453461 width=16) (actual time=26.440..2633.307 rows=5262632 loops=1)
Output: prj_projet.prj_spkprojet, utl_utilisateur_groupe.ugr_sfkutilisateur
Inner Unique: true
Hash Cond: (prj_projet_group.rpg_sfkprojet = prj_projet.prj_spkprojet)
Buffers: shared hit=5485 read=308
-> Nested Loop (cost=0.86..91364.97 rows=7453461 width=16) (actual time=21.612..1375.266 rows=5262632 loops=1)
Output: prj_projet_group.rpg_sfkprojet, utl_utilisateur_groupe.ugr_sfkutilisateur
Buffers: shared hit=5440 read=308
-> Nested Loop (cost=0.43..138.76 rows=1442 width=16) (actual time=21.555..28.826 rows=1560 loops=1)
Output: utl_utilisateur_groupe.ugr_sfkutilisateur, utl_utilisateur_groupe.ugr_sfkgroupe
Inner Unique: true
Buffers: shared hit=780
-> Index Scan using fk02_utilisateur_groupe_idx on baseteamberv20utl.utl_utilisateur_groupe (cost=0.28..67.44 rows=1442 width=16) (actual time=21.496..24.754 rows=1560 loops=1)
Output: utl_utilisateur_groupe.ugr_spkutilisateurgroupe, utl_utilisateur_groupe.ugr_sfkinstance, utl_utilisateur_groupe.ugr_sfkdroits, utl_utilisateur_groupe.ugr_sfkfonction, utl_utilisateur_groupe.ugr_sfkgroupe, utl_utilisateur_groupe.ugr_sfkutilisateur, utl_utilisateur_groupe.ugr_sdesactive, utl_utilisateur_groupe.created_by, utl_utilisateur_groupe.created_date, utl_utilisateur_groupe.last_modified_by, utl_utilisateur_groupe.last_modified_date
Filter: ((utl_utilisateur_groupe.ugr_sfkdroits = 56) OR (utl_utilisateur_groupe.ugr_sfkdroits = 57))
Rows Removed by Filter: 314
Buffers: shared hit=366
-> Memoize (cost=0.15..0.18 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1560)
Output: utl_utilisateur.utl_spkutilisateur
Cache Key: utl_utilisateur_groupe.ugr_sfkutilisateur
Cache Mode: logical
Hits: 1353 Misses: 207 Evictions: 0 Overflows: 0 Memory Usage: 23kB
Buffers: shared hit=414
-> Index Scan using utl_utilisateur_pkey on baseteamberv20utl.utl_utilisateur (cost=0.14..0.17 rows=1 width=8) (actual time=0.005..0.005 rows=1 loops=207)
Output: utl_utilisateur.utl_spkutilisateur
Index Cond: (utl_utilisateur.utl_spkutilisateur = utl_utilisateur_groupe.ugr_sfkutilisateur)
Filter: (utl_utilisateur.utl_sdesactive = 0)
Buffers: shared hit=414
-> Memoize (cost=0.43..87.29 rows=4693 width=16) (actual time=0.002..0.378 rows=3373 loops=1560)
Output: prj_projet_group.rpg_sfkprojet, prj_projet_group.rpg_sfkgroupe
Cache Key: utl_utilisateur_groupe.ugr_sfkgroupe
Cache Mode: logical
Hits: 1490 Misses: 70 Evictions: 0 Overflows: 0 Memory Usage: 16439kB
Buffers: shared hit=4660 read=308
-> Index Scan using fk02_projet_group_idx on baseteamberv20prj.prj_projet_group (cost=0.42..87.28 rows=4693 width=16) (actual time=0.043..2.152 rows=5008 loops=70)
Output: prj_projet_group.rpg_sfkprojet, prj_projet_group.rpg_sfkgroupe
Index Cond: (prj_projet_group.rpg_sfkgroupe = utl_utilisateur_groupe.ugr_sfkgroupe)
Buffers: shared hit=4660 read=308
-> Hash (cost=281.63..281.63 rows=15456 width=8) (actual time=4.797..4.799 rows=15456 loops=1)
Output: prj_projet.prj_spkprojet
Buckets: 16384 Batches: 1 Memory Usage: 732kB
Buffers: shared hit=45
-> Index Only Scan using prj_projet_pkey on baseteamberv20prj.prj_projet (cost=0.29..281.63 rows=15456 width=8) (actual time=0.019..1.995 rows=15456 loops=1)
Output: prj_projet.prj_spkprojet
Heap Fetches: 0
Buffers: shared hit=45
Query Identifier: -964312065641516586
Planning:
Buffers: shared hit=484
Planning Time: 3.152 ms
JIT:
Functions: 35
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 2.191 ms, Inlining 0.000 ms, Optimization 1.139 ms, Emission 20.393 ms, Total 23.723 ms
Execution Time: 5568.282 ms
Pour les DDL de projet :
-- auto-generated definition create table prj_projet ( prj_spkprojet bigserial primary key, prj_created_by bigint, prj_created_date timestamp default CURRENT_TIMESTAMP not null, prj_last_modified_by bigint, prj_last_modified_date timestamp, prj_bpourproduction boolean, prj_bprojetmodele boolean, prj_cdescription text, prj_cintitulelongprojet varchar(255), prj_cintituleprojet varchar(255) not null, prj_cnumeroprojet varchar(255), prj_cphoto oid, prj_davancement numeric(19, 2), prj_sdesactive integer default 0, prj_sfketat bigint, prj_sfktype bigint, prj_sfkinstance integer, prj_sfkcategorie bigint not null constraint fkqptq3ipqfp4p73y5dh3mto5ai references prj_projet_categorie, prj_sniddrupal varchar(255) ); alter table prj_projet owner to wandeed; create index fk01_projet_idx on prj_projet (prj_sfktype); create index fk02_projet_idx on prj_projet (prj_sfkcategorie);
sanchant qu’on ne voit pas ici (je ne sais pas pourquoi) l’index sur la clé primaire alors quelle est bien présente comme le montre la capture ci-dessous:
Pour les DDL de projet groupe :
-- auto-generated definition create table prj_projet_group ( rpg_spkprojetgroupe bigserial primary key, rpg_created_by bigint, rpg_created_date timestamp default CURRENT_TIMESTAMP not null, rpg_last_modified_by bigint, rpg_last_modified_date timestamp, rpg_sdesactive integer default 0, rpg_sfkgroupe bigint, rpg_sfkinstance integer, rpg_sfkrole bigint, rpg_sfktyperelation bigint not null, rpg_sfkprojet bigint not null constraint fkov3jcw2mnaovo28nfaf8166wq references prj_projet ); alter table prj_projet_group owner to wandeed; create index fk01_projet_group_idx on prj_projet_group (rpg_sfkprojet); create index fk02_projet_group_idx on prj_projet_group (rpg_sfkgroupe); create index fk03_projet_group_idx on prj_projet_group (rpg_sfktyperelation); create index prj_projet_group_rpg_sfkprojet_rpg_sfkgroupe_idx on prj_projet_group (rpg_sfkprojet, rpg_sfkgroupe) where ((rpg_sfktyperelation = 650) AND (rpg_sdesactive = 0));
Pour les DDL de utilisateur groupe :
create table utl_utilisateur_groupe ( ugr_spkutilisateurgroupe bigint default nextval('baseteamberv20utl.ugr_spkutilisateurgroupe_seq'::regclass) not null primary key, ugr_sfkinstance integer default 1 not null, ugr_sfkdroits bigint, ugr_sfkfonction bigint, ugr_sfkgroupe bigint, ugr_sfkutilisateur bigint constraint utl_utilisateur_groupe_fk references utl_utilisateur, ugr_sdesactive integer default 0 not null, created_by bigint, created_date timestamp, last_modified_by bigint, last_modified_date timestamp, constraint unique_user_group unique (ugr_sfkdroits, ugr_sfkgroupe, ugr_sfkutilisateur, ugr_sfkfonction) ); alter table utl_utilisateur_groupe owner to wandeed; create index fk01_utilisateur_groupe_idx on utl_utilisateur_groupe (ugr_sfkutilisateur); create index fk02_utilisateur_groupe_idx on utl_utilisateur_groupe (ugr_sfkgroupe); create index fk03_utilisateur_groupe_idx on utl_utilisateur_groupe (ugr_sfkdroits); create index groupe_utilisateur on utl_utilisateur_groupe (ugr_sfkgroupe, ugr_sfkutilisateur) where ((ugr_sdesactive = 0) AND ((ugr_sfkdroits = 56) OR (ugr_sfkdroits = 57)));
2
Answers
prj_projet.prj_spkprojet
is causing both slow times, check the index:In the first, most time is spent in this index scan
In the second, most time is spent aggregating with this same field as group key.
Check the index, and if that doesn’t lead to results, please come back with the DDL of the table/indexes.
(Looking only at the first query)
The correlated subquery is executed in retail, and doing any subquery over 2 million times is going to be slow, even if each execution is rather fast. This only explains about half the total execution time (0.004ms * 2224540 = 9s out of 15s) but the remaining half of the time is completely mysterious based on your plan. It could be due to the overhead of calling the clock function a huge number of times–to test that theory you could try EXPLAIN (ANALYZE, BUFFERS, TIMING OFF) and see if the total execution time drops.
To optimize, you could try calculating the subquery for every prj_spkprojet in bulk, and then join the rest of the query to that result:
Of course if pg_largeobject is always empty as it was in your demo, then this whole subquery is pointless in the first place.
Maybe the planner could be a bit smarter about doing this transformation for you, but as of now it is not.
I would also turn JIT off. The time (visibly) spent doing it is not large, but is almost surely wasted, and it introduces a lot of complexity (and potential for bugs) for little potential gain.