I have a large WordPress site with 170,000 users and a lot of daily page views.
I just tuned all the MySQL indexes based on several comments, but my slow log still shows SELECT DISTINCT wp_usermeta.meta_key FROM wp_usermeta; taking around 3 seconds.
Server hardware: a dedicated server with a 64-core AMD EPYC, 128 GB DDR4, and 2×480 GB NVMe SSDs.
The DB server is the newest MariaDB version, and its config is (InnoDB tables only):
innodb_buffer_pool_size = 64G
innodb_log_file_size = 16G
innodb_buffer_pool_instances = 16
innodb_io_capacity = 5000
max_binlog_size = 200M
max_connections = 250
wait_timeout = 28700
interactive_timeout = 28700
join_buffer_size = 128M
expire_logs_days = 3
skip-host-cache
skip-name-resolve
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
sql-mode = "STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION"
tmp_table_size = 256M
max_heap_table_size = 256M
table_definition_cache = 500
sort_buffer_size = 24M
key_buffer_size = 32M
performance_schema = on
Maybe someone has some suggestions?
4 Answers
Suggestions to consider for your my.ini [mysqld] section to enable DEMAND query cache utilization.
We should probably have a Skype talk later today before making any changes. I will check in within a few hours (3-4).
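For illustration only, a minimal sketch of what DEMAND-mode settings might look like in the my.ini [mysqld] section. The sizes are placeholder assumptions, and note that another answer below recommends turning the query cache off entirely:

query_cache_type = DEMAND   # cache only SELECT SQL_CACHE ... statements
query_cache_size = 64M      # placeholder; total memory reserved for the cache
query_cache_limit = 1M      # placeholder; largest single result set that may be cached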
Of the 49 'values' that are associated with each user, how many are used in a WHERE or ORDER BY? I suspect only a few. Here's a way to work around WP's abuse of the "Entity-Attribute-Value" design pattern.
Let's say a, b, and c are useful for filtering and/or ordering, and the other 46 values are simply saved for displaying later. Have 4 rows, not 49, in usermeta for each user: 3 rows for a, b, and c, and a fourth holding a JSON string of the rest of the stuff.
Then make the application aware of the JSON and code accordingly.
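A rough sketch of the consolidation in SQL, assuming MariaDB 10.5+ for JSON_OBJECTAGG; the key names a, b, c and the meta_key 'packed_meta' are placeholders:

-- Gather every "display-only" meta row into one JSON row per user
-- (a, b, c stand in for the keys you keep as real rows)
INSERT INTO wp_usermeta (user_id, meta_key, meta_value)
SELECT user_id, 'packed_meta', JSON_OBJECTAGG(meta_key, meta_value)
FROM wp_usermeta
WHERE meta_key NOT IN ('a', 'b', 'c')
GROUP BY user_id;

-- After verifying the packed rows, drop the originals
DELETE FROM wp_usermeta
WHERE meta_key NOT IN ('a', 'b', 'c', 'packed_meta');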
This change would necessitate rebuilding wp_usermeta. The 46 rows per user would be gathered together and rearranged into a single meta row with a moderately large JSON string (in meta_value). That might not shrink the table much, but it would make it faster to use.
Observations:
The More Important Issues:
I do not see any items that seem critical to help with the problem you are having.
Details and other observations:
( innodb_lru_scan_depth ) = 1,536
— "InnoDB: page_cleaner: 1000ms intended loop took …" may be fixed by lowering lru_scan_depth
( innodb_io_capacity_max ) = 10,000
— When urgently flushing, use this many IOPs.
— Reads could be sluggish or spiky.
( Innodb_buffer_pool_pages_free / Innodb_buffer_pool_pages_total ) = 2,787,201 / 4145152 = 67.2%
— Pct of buffer_pool currently not in use.
— innodb_buffer_pool_size (now 68719476736) is bigger than necessary?
( Innodb_buffer_pool_bytes_data / innodb_buffer_pool_size ) = 22,248,669,184 / 65536M = 32.4%
— Percent of buffer pool taken up by data.
— A small percent may indicate that the buffer_pool is unnecessarily big.
( Innodb_log_writes ) = 5,298,275 / 96535 = 55 /sec
( Uptime / 60 * innodb_log_file_size / Innodb_os_log_written ) = 96,535 / 60 * 16384M / 6560327680 = 4,213
— Minutes between InnoDB log rotations. Beginning with 5.6.8, innodb_log_file_size can be changed dynamically; I don't know about MariaDB. Be sure to also change my.cnf.
— (The recommendation of 60 minutes between rotations is somewhat arbitrary.) Adjust innodb_log_file_size (now 17179869184). (Cannot change in AWS.)
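If you want to recompute that rotation interval yourself, a sketch (MariaDB still exposes GLOBAL STATUS via information_schema; MySQL 5.7+ moved it to performance_schema):

-- Same formula as above: Uptime / 60 * innodb_log_file_size / Innodb_os_log_written
SELECT (up.VARIABLE_VALUE / 60) * @@innodb_log_file_size / lw.VARIABLE_VALUE
       AS minutes_between_rotations
FROM information_schema.GLOBAL_STATUS up,
     information_schema.GLOBAL_STATUS lw
WHERE up.VARIABLE_NAME = 'Uptime'
  AND lw.VARIABLE_NAME = 'Innodb_os_log_written';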
( Innodb_row_lock_waits ) = 83,931 / 96535 = 0.87 /sec
— How often there is a delay in getting a row lock.
— May be caused by complex queries that could be optimized.
( Innodb_row_lock_waits/Innodb_rows_inserted ) = 83,931/1560067 = 5.4%
— Frequency of having to wait for a row.
( innodb_flush_neighbors ) = 1
— A minor optimization when writing blocks to disk.
— Use 0 for SSD drives; 1 for HDD.
( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF
— Whether to log all Deadlocks.
— If you are plagued with Deadlocks, turn this on. Caution: if you have lots of deadlocks, this may write a lot to disk.
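Both innodb_flush_neighbors (above) and innodb_print_all_deadlocks are dynamic, so they can be tried without a restart (mirror the changes in my.cnf to persist them):

SET GLOBAL innodb_flush_neighbors = 0;       -- 0 suits SSDs; 1 suits spinning disks
SET GLOBAL innodb_print_all_deadlocks = ON;  -- writes every deadlock to the error log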
( join_buffer_size * Max_used_connections ) = (128M * 127) / 131072M = 12.4%
— (A metric for pondering the size of join_buffer_size.)
— join_buffer_size (now 134217728) should probably be shrunk to avoid running out of RAM.
( (Com_show_create_table + Com_show_fields) / Questions ) = (66 + 1370563) / 37103211 = 3.7%
— Naughty framework — spending a lot of effort rediscovering the schema.
— Complain to the 3rd party vendor.
( local_infile ) = local_infile = ON
— local_infile (now ON) is a potential security issue.
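It is also dynamic, so it can be switched off immediately (and in my.cnf to persist):

SET GLOBAL local_infile = OFF;  -- disables LOAD DATA LOCAL INFILE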
( Created_tmp_tables ) = 2,088,713 / 96535 = 22 /sec
— Frequency of creating "temp" tables as part of complex SELECTs.
( Created_tmp_disk_tables ) = 1,751,146 / 96535 = 18 /sec
— Frequency of creating disk "temp" tables as part of complex SELECTs.
— Increase tmp_table_size (now 268435456) and max_heap_table_size (now 268435456).
Check the rules for temp tables on when MEMORY is used instead of MyISAM. Perhaps minor schema or query changes can avoid MyISAM.
Better indexes and reformulation of queries are more likely to help.
( Created_tmp_disk_tables / Questions ) = 1,751,146 / 37103211 = 4.7%
— Pct of queries that needed an on-disk tmp table.
— Better indexes / no blobs / etc.
( Created_tmp_disk_tables / Created_tmp_tables ) = 1,751,146 / 2088713 = 83.8%
— Percent of temp tables that spilled to disk.
— Maybe increase tmp_table_size (now 268435456) and max_heap_table_size (now 268435456); improve indexes; avoid blobs, etc.
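Both ceilings are dynamic and act as a pair (the lower of the two is the effective in-memory limit), so they can be raised together without a restart; 512M here is only an illustrative value:

SHOW GLOBAL STATUS LIKE 'Created_tmp%';             -- watch the spill counters
SET GLOBAL tmp_table_size      = 512 * 1024 * 1024;
SET GLOBAL max_heap_table_size = 512 * 1024 * 1024;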
( Handler_read_rnd_next ) = 104,164,660,719 / 96535 = 1079035 /sec
— High if lots of table scans.
— Possibly inadequate keys.
( (Com_insert + Com_update + Com_delete + Com_replace) / Com_commit ) = (1561842 + 4652536 + 13886 + 42) / 352 = 17,694
— Statements per Commit (assuming all InnoDB).
— High: long transactions strain various things.
( Com_insert + Com_delete + Com_delete_multi + Com_replace + Com_update + Com_update_multi ) = (1561842 + 13886 + 0 + 42 + 4652536 + 794) / 96535 = 65 /sec
— Writes/sec.
— 50 writes/sec + log flushes will probably max out the I/O write capacity of HDD drives.
( ( Com_stmt_prepare - Com_stmt_close ) / ( Com_stmt_prepare + Com_stmt_close ) ) = ( 2208251 - 1415 ) / ( 2208251 + 1415 ) = 99.9%
— Are you closing your prepared statements?
— Add Closes.
( Com_stmt_prepare - Com_stmt_close ) = 2,208,251 - 1415 = 2.21e+6
— How many prepared statements have not been closed.
— CLOSE prepared statements.
( Com_stmt_close / Com_stmt_prepare ) = 1,415 / 2208251 = 0.06%
— Prepared statements should be Closed.
— Check whether all Prepared statements are "Closed".
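At the SQL level, the lifecycle being counted looks like this; the query itself is hypothetical, and the point is the final DEALLOCATE (the "Close" the counters say is missing):

PREPARE stmt FROM 'SELECT meta_value FROM wp_usermeta WHERE user_id = ? AND meta_key = ?';
SET @uid = 1, @mk = 'nickname';
EXECUTE stmt USING @uid, @mk;
DEALLOCATE PREPARE stmt;  -- without this, Com_stmt_close never catches up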
( binlog_format ) = binlog_format = MIXED
— STATEMENT/ROW/MIXED.
— ROW is preferred by 5.7 (10.3).
( Syncs ) = 5,727,396 / 96535 = 59 /sec
— Sync to disk for binlog.
( Com_change_db ) = 1,168,504 / 96535 = 12 /sec
— Probably comes from USE statements.
— Consider connecting with the DB, using db.tbl syntax, eliminating spurious USE statements, etc.
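For example (the database name wordpress here is a placeholder):

-- Instead of switching databases first...
USE wordpress;
SELECT meta_value FROM wp_usermeta WHERE user_id = 1;

-- ...qualify the table and skip the USE entirely:
SELECT meta_value FROM wordpress.wp_usermeta WHERE user_id = 1;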
( Connections ) = 3,377,949 / 96535 = 35 /sec
— Connections.
— Increase wait_timeout (now 28700); use pooling?
( thread_cache_size / Max_used_connections ) = 250 / 127 = 196.9%
— There is no advantage in having the thread cache bigger than your likely number of connections. Wasting space is the disadvantage.
( thread_pool_size ) = 64
— Number of 'thread groups'. Limits how many threads can be executing at once.
— Probably should not be set much higher than the number of CPU cores.
You have the Query Cache half-off. You should set both query_cache_type = OFF and query_cache_size = 0. There is (according to a rumor) a 'bug' in the QC code that leaves some code on unless you turn off both of those settings.
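In my.cnf terms, the two settings together:

query_cache_type = OFF
query_cache_size = 0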
(The analysis script's remaining query-cache and doublewrite ratios hit divide-by-zero errors here, because counters such as Qcache_queries_in_cache, Qcache_hits, and Innodb_pages_written are all 0.)
Your server provisioning is fine. If anything, it’s overprovisioned for your site.
If I understand your comments correctly, you have many occurrences of this offending query:

SELECT DISTINCT wp_usermeta.meta_key FROM wp_usermeta

And it seems like that query generates a 172K-row result set. Yet you say each user has the entirely reasonable number of 49 rows in wp_usermeta. So, as @RickJames pointed out, it looks like each user somehow gets their own personal unique wp_usermeta.meta_key value. WordPress core doesn't do that and would never do that. The point of those wp_whatevermeta tables is to have a limited number of keys. Also, rerunning that particular query very often is grossly inefficient. What conceivable purpose, other than some global list of users, does that query serve? So a plugin is surely implicated in this. If you can get the Query Monitor plugin (https://wordpress.org/plugins/query-monitor/) to work, it will tell you what software generated the offending query.

There's nothing magic about queries without WHERE clauses. SELECTs like COUNT(*) and DISTINCT generally need to scan through an entire table or index to generate their results, so when a table is large the query takes a long time. If you have an index on wp_usermeta.meta_key, and it indexes the whole column, not the 191-character prefix, the offending query should do a relatively inexpensive loose index scan. But it still has to do the index scan.

And try the plugin Rick James and I put together: https://wordpress.org/plugins/index-wp-mysql-for-speed/ It makes useful indexes, and also has a couple of diagnostic tools built in.
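On the indexing point, a sketch of checking for that loose index scan. It assumes MariaDB/InnoDB with the default DYNAMIC row format, which allows a full-length utf8mb4 varchar(255) key; the index name is arbitrary:

ALTER TABLE wp_usermeta ADD INDEX meta_key_full (meta_key);  -- full column, not meta_key(191)
EXPLAIN SELECT DISTINCT meta_key FROM wp_usermeta;
-- "Using index for group-by" in the Extra column confirms the loose index scan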