I’m really puzzled by the following query:
explain analyze
(with z as (
select 1 as attraction
,1 as nd_attraction
, n.fc
from (select pe_id,zo_id,co_id,value_num as dt
from rtpezocoav x --use view??
where pe_id = 405 and ad_id = 4
) dt
join calc_nodes n on n.co_id = dt.co_id and n.pe_id = 405 and n.se_id = 12
join calc_zones z on dt.zo_id = z.zo_id and z.pe_id = 405 and z.se_id = 12
left join calc_shopconcepts sh on sh.pe_id = 405 and sh.se_id = 12 and sh.sh_id = n.shop_concept and sh.ct_id = n.ct_id
left join rtsepezobrav com on com.se_id = 12 and com.pe_id = 405 and com.zo_id = z.zo_id and com.br_id = n.br_id and com.ad_id = 308 --MV: switched to table instead of view
join rtseperereav rere on rere.se_id = 12 and rere.pe_id = 405 and rere.re_id = z.re_id and rere.re2_id = n.re_id and rere.ad_id = 306 --MV: switched to table instead of view
)
select sum(attraction::float), sum(nd_attraction::float) from z)
When I run it with EXPLAIN ANALYZE
included, it finishes in < 0.5s and shows a clear parallel plan:
Finalize Aggregate (cost=455832.88..455832.89 rows=1 width=16) (actual time=397.581..397.581 rows=1 loops=1)
-> Gather (cost=455832.83..455832.84 rows=8 width=8) (actual time=397.355..399.729 rows=8 loops=1)
Workers Planned: 8
Workers Launched: 7
-> Partial Aggregate (cost=455832.83..455832.84 rows=1 width=8) (actual time=382.349..382.349 rows=1 loops=8)
-> Hash Join (cost=9839.72..454742.30 rows=436213 width=0) (actual time=21.295..355.953 rows=627375 loops=8)
Hash Cond: ((z.re_id = rere.re_id) AND (n.re_id = rere.re2_id))
-> Parallel Hash Join (cost=9835.10..451732.66 rows=436213 width=8) (actual time=21.079..286.712 rows=627375 loops=8)
Hash Cond: (x.zo_id = z.zo_id)
-> Parallel Hash Join (cost=4170.35..431738.52 rows=435194 width=12) (actual time=18.544..193.865 rows=627375 loops=8)
Hash Cond: (x.co_id = n.co_id)
-> Parallel Bitmap Heap Scan on rtpezocoav x (cost=1291.48..414863.71 rows=609897 width=8) (actual time=16.846..106.176 rows=627375 loops=8)
Recheck Cond: (pe_id = 405)
Rows Removed by Index Recheck: 104866
Filter: (ad_id = 4)
Heap Blocks: lossy=5558
-> Bitmap Index Scan on rtpezocoav_pe_id_idx (cost=0.00..71.69 rows=5315319 width=0) (actual time=3.754..3.754 rows=578560 loops=1)
Index Cond: (pe_id = 405)
-> Parallel Hash (cost=2843.07..2843.07 rows=2864 width=12) (actual time=1.624..1.624 rows=616 loops=8)
Buckets: 8192 Batches: 1 Memory Usage: 416kB
-> Hash Left Join (cost=126.56..2843.07 rows=2864 width=12) (actual time=0.572..1.348 rows=703 loops=7)
Hash Cond: ((n.shop_concept = (sh.sh_id)::double precision) AND (n.ct_id = sh.ct_id))
-> Parallel Bitmap Heap Scan on calc_nodes n (cost=122.19..2813.20 rows=2864 width=24) (actual time=0.439..0.789 rows=703 loops=7)
Recheck Cond: ((pe_id = 405) AND (se_id = 12))
Heap Blocks: exact=53
-> Bitmap Index Scan on idx_calc_nodes_sepe (cost=0.00..120.97 rows=4868 width=0) (actual time=0.580..0.580 rows=4924 loops=1)
Index Cond: ((pe_id = 405) AND (se_id = 12))
-> Hash (cost=4.33..4.33 rows=3 width=8) (actual time=0.057..0.057 rows=3 loops=7)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Index Only Scan using idx_calc_shopconcepts_sepebr on calc_shopconcepts sh (cost=0.27..4.33 rows=3 width=8) (actual time=0.042..0.043 rows=3 loops=7)
Index Cond: ((pe_id = 405) AND (se_id = 12))
Heap Fetches: 0
-> Parallel Hash (cost=5551.19..5551.19 rows=9085 width=8) (actual time=2.405..2.405 rows=2721 loops=8)
Buckets: 32768 Batches: 1 Memory Usage: 1184kB
-> Parallel Bitmap Heap Scan on calc_zones z (cost=539.91..5551.19 rows=9085 width=8) (actual time=0.657..2.484 rows=5442 loops=4)
Recheck Cond: ((pe_id = 405) AND (se_id = 12))
Heap Blocks: exact=407
-> Bitmap Index Scan on idx_calc_zones_sepe (cost=0.00..534.46 rows=21804 width=0) (actual time=2.351..2.351 rows=21768 loops=1)
Index Cond: ((pe_id = 405) AND (se_id = 12))
-> Hash (cost=4.48..4.48 rows=9 width=8) (actual time=0.080..0.080 rows=9 loops=8)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Index Only Scan using rtseperereav_unique on rtseperereav rere (cost=0.28..4.48 rows=9 width=8) (actual time=0.057..0.060 rows=9 loops=8)
Index Cond: ((se_id = 12) AND (pe_id = 405) AND (ad_id = 306))
Heap Fetches: 0
Planning Time: 3.801 ms
Execution Time: 399.879 ms
I also see 7 cores firing up briefly. However, when I run the query without EXPLAIN ANALYZE
, it takes 2.5s and only 1 core is used.
Auto-explain in my database logs shows the same query plan but:
Workers Planned: 8; Workers Launched: 0
What could cause this ?
Below are my settings. I basically did the following:
set max_parallel_workers_per_gather = 8;
set parallel_setup_cost = 0;--1000
set parallel_tuple_cost = 0;
ALTER TABLE rtpezocoav SET (parallel_workers = 8);
These are my settings:
allow_system_table_mods off
application_name DBEAVER - Mathias Versichele
archive_cleanup_command
archive_command pgbackrest --stanza=main archive-push %p
archive_mode on
archive_timeout 0
array_nulls on
authentication_timeout 1min
auto_explain.log_analyze on
auto_explain.log_buffers off
auto_explain.log_format text
auto_explain.log_level log
auto_explain.log_min_duration 2s
auto_explain.log_nested_statements on
auto_explain.log_settings off
auto_explain.log_timing on
auto_explain.log_triggers off
auto_explain.log_verbose off
auto_explain.sample_rate 1
autovacuum on
autovacuum_analyze_scale_factor 0.1
autovacuum_analyze_threshold 50
autovacuum_freeze_max_age 200000000
autovacuum_max_workers 3
autovacuum_multixact_freeze_max_age 400000000
autovacuum_naptime 1min
autovacuum_vacuum_cost_delay 2ms
autovacuum_vacuum_cost_limit -1
autovacuum_vacuum_scale_factor 0.2
autovacuum_vacuum_threshold 50
autovacuum_work_mem -1
backend_flush_after 0
backslash_quote safe_encoding
bgwriter_delay 200ms
bgwriter_flush_after 512kB
bgwriter_lru_maxpages 100
bgwriter_lru_multiplier 2
block_size 8192
bonjour off
bonjour_name
bytea_output hex
check_function_bodies on
checkpoint_completion_target 0.5
checkpoint_flush_after 256kB
checkpoint_timeout 5min
checkpoint_warning 30s
client_encoding UTF8
client_min_messages notice
cluster_name
commit_delay 0
commit_siblings 5
config_file /var/lib/pgsql/12/data/postgresql.conf
constraint_exclusion partition
cpu_index_tuple_cost 0.005
cpu_operator_cost 0.0025
cpu_tuple_cost 0.01
cursor_tuple_fraction 0.1
data_checksums off
data_directory /var/lib/pgsql/12/data
data_directory_mode 0700
data_sync_retry off
DateStyle ISO, MDY
db_user_namespace off
deadlock_timeout 1s
debug_assertions off
debug_pretty_print on
debug_print_parse off
debug_print_plan off
debug_print_rewritten off
default_statistics_target 100
default_table_access_method heap
default_tablespace
default_text_search_config pg_catalog.english
default_transaction_deferrable off
default_transaction_isolation read committed
default_transaction_read_only off
dynamic_library_path $libdir
dynamic_shared_memory_type posix
effective_cache_size 4GB
effective_io_concurrency 1
enable_bitmapscan on
enable_gathermerge on
enable_hashagg on
enable_hashjoin on
enable_indexonlyscan on
enable_indexscan on
enable_material on
enable_mergejoin on
enable_nestloop on
enable_parallel_append on
enable_parallel_hash on
enable_partition_pruning on
enable_partitionwise_aggregate off
enable_partitionwise_join off
enable_seqscan on
enable_sort on
enable_tidscan on
escape_string_warning on
event_source PostgreSQL
exit_on_error off
external_pid_file
extra_float_digits 3
force_parallel_mode off
from_collapse_limit 8
fsync on
full_page_writes on
geqo on
geqo_effort 5
geqo_generations 0
geqo_pool_size 0
geqo_seed 0
geqo_selection_bias 2
geqo_threshold 12
gin_fuzzy_search_limit 0
gin_pending_list_limit 4MB
hba_file /var/lib/pgsql/12/data/pg_hba.conf
hot_standby on
hot_standby_feedback off
huge_pages try
ident_file /var/lib/pgsql/12/data/pg_ident.conf
idle_in_transaction_session_timeout 0
ignore_checksum_failure off
ignore_system_indexes off
integer_datetimes on
IntervalStyle postgres
jit on
jit_above_cost 100000
jit_debugging_support off
jit_dump_bitcode off
jit_expressions on
jit_inline_above_cost 500000
jit_optimize_above_cost 500000
jit_profiling_support off
jit_provider llvmjit
jit_tuple_deforming on
join_collapse_limit 8
krb_caseins_users off
krb_server_keyfile FILE:/etc/sysconfig/pgsql/krb5.keytab
lc_collate en_US.UTF-8
lc_ctype en_US.UTF-8
lc_messages en_US.UTF-8
lc_monetary en_US.UTF-8
lc_numeric en_US.UTF-8
lc_time en_US.UTF-8
listen_addresses *
lo_compat_privileges off
local_preload_libraries
lock_timeout 0
log_autovacuum_min_duration -1
log_checkpoints off
log_connections off
log_destination stderr,csvlog
log_directory log
log_disconnections off
log_duration off
log_error_verbosity verbose
log_executor_stats off
log_file_mode 0600
log_filename postgresql-%a.log
log_hostname off
log_line_prefix
log_lock_waits off
log_min_duration_statement 1s
log_min_error_statement error
log_min_messages notice
log_parser_stats off
log_planner_stats off
log_replication_commands off
log_rotation_age 1d
log_rotation_size 0
log_statement none
log_statement_stats off
log_temp_files -1
log_timezone Europe/Berlin
log_transaction_sample_rate 0
log_truncate_on_rotation on
logging_collector on
maintenance_work_mem 2GB
max_connections 500
max_files_per_process 1000
max_function_args 100
max_identifier_length 63
max_index_keys 32
max_locks_per_transaction 1500
max_logical_replication_workers 4
max_parallel_maintenance_workers 4
max_parallel_workers 8
max_parallel_workers_per_gather 8
max_pred_locks_per_page 2
max_pred_locks_per_relation -2
max_pred_locks_per_transaction 64
max_prepared_transactions 0
max_replication_slots 10
max_stack_depth 2MB
max_standby_archive_delay 30s
max_standby_streaming_delay 30s
max_sync_workers_per_subscription 2
max_wal_senders 3
max_wal_size 3GB
max_worker_processes 8
min_parallel_index_scan_size 512kB
min_parallel_table_scan_size 8MB
min_wal_size 80MB
old_snapshot_threshold -1
operator_precedence_warning off
parallel_leader_participation on
parallel_setup_cost 0
parallel_tuple_cost 0
password_encryption md5
plan_cache_mode auto
plpgsql.check_asserts on
plpgsql.extra_errors none
plpgsql.extra_warnings none
plpgsql.print_strict_params off
plpgsql.variable_conflict error
port 5433
post_auth_delay 0
pre_auth_delay 0
primary_conninfo
primary_slot_name
promote_trigger_file
quote_all_identifiers off
random_page_cost 4
recovery_end_command
recovery_min_apply_delay 0
recovery_target
recovery_target_action pause
recovery_target_inclusive on
recovery_target_lsn
recovery_target_name
recovery_target_time
recovery_target_timeline latest
recovery_target_xid
restart_after_crash on
restore_command
row_security on
search_path beobank20_predict
segment_size 1GB
seq_page_cost 1
server_encoding UTF8
server_version 12.2
server_version_num 120002
session_preload_libraries
session_replication_role origin
shared_buffers 40GB
shared_memory_type mmap
shared_preload_libraries auto_explain
ssl off
ssl_ca_file
ssl_cert_file server.crt
ssl_ciphers HIGH:MEDIUM:+3DES:!aNULL
ssl_crl_file
ssl_dh_params_file
ssl_ecdh_curve prime256v1
ssl_key_file server.key
ssl_library OpenSSL
ssl_max_protocol_version
ssl_min_protocol_version TLSv1
ssl_passphrase_command
ssl_passphrase_command_supports_reload off
ssl_prefer_server_ciphers on
standard_conforming_strings on
statement_timeout 0
stats_temp_directory pg_stat_tmp
superuser_reserved_connections 3
synchronize_seqscans on
synchronous_commit on
synchronous_standby_names
syslog_facility local0
syslog_ident postgres
syslog_sequence_numbers on
syslog_split_messages on
tcp_keepalives_count 9
tcp_keepalives_idle 7200
tcp_keepalives_interval 75
tcp_user_timeout 0
temp_buffers 8MB
temp_file_limit -1
temp_tablespaces
TimeZone Europe/Brussels
timezone_abbreviations Default
trace_notify off
trace_recovery_messages log
trace_sort off
track_activities on
track_activity_query_size 16kB
track_commit_timestamp off
track_counts on
track_functions none
track_io_timing off
transaction_deferrable off
transaction_isolation read committed
transaction_read_only off
transform_null_equals off
unix_socket_directories /var/run/postgresql, /tmp
unix_socket_group
unix_socket_permissions 0777
update_process_title on
vacuum_cleanup_index_scale_factor 0.1
vacuum_cost_delay 0
vacuum_cost_limit 200
vacuum_cost_page_dirty 20
vacuum_cost_page_hit 1
vacuum_cost_page_miss 10
vacuum_defer_cleanup_age 0
vacuum_freeze_min_age 50000000
vacuum_freeze_table_age 150000000
vacuum_multixact_freeze_min_age 5000000
vacuum_multixact_freeze_table_age 150000000
wal_block_size 8192
wal_buffers 16MB
wal_compression off
wal_consistency_checking
wal_init_zero on
wal_keep_segments 0
wal_level replica
wal_log_hints off
wal_receiver_status_interval 10s
wal_receiver_timeout 1min
wal_recycle on
wal_retrieve_retry_interval 5s
wal_segment_size 16MB
wal_sender_timeout 1min
wal_sync_method fdatasync
wal_writer_delay 200ms
wal_writer_flush_after 1MB
work_mem 1280MB
xmlbinary base64
xmloption content
zero_damaged_pages off
The database is on Linux (centos-release-7-7.1908.0.el7.centos.x86_64).
2
Answers
So, DBeaver using JDBC driver was the issue here. From now on, I'll measure performance in psql. Thx @adrianklaver.
There are two possible explanations for 0 workers being launched when 8 were planned:
All available parallel workers (
max_parallel_workers
) are being used. This is the likely explanation given that even with the fast plan you are showing, only 7 out of 8 planned workers could be launched.You are running into one of the obstacles shown in the documentation.