skip to Main Content

MySQL official documentation 8.8.5 Estimating Query Performance offers the way to count disk seeks:

log(row_count) / log(index_block_length / 3 * 2 / (index_length + data_pointer_length)) + 1

In MySQL, an index block is usually 1,024 bytes and the data pointer is usually four bytes. For a 500,000-row table with a key value length of three bytes (the size of MEDIUMINT), the formula indicates log(500,000)/log(1024/3*2/(3+4)) + 1 = 4 seeks.

Innodb index is implemented by B-Tree, and I understand the formula. But there are two key points I cannot catch on in the sentence In MySQL, an index block is usually 1,024 bytes and the data pointer is usually four bytes.

My questions are:

  1. Innodb is index-organized table, so index block should equal to page
    block size which is default to 16k innodb_page_size. I googled
    index_block_length, but nothing found, so why it says usually 1,024 bytes here? Could someone list the documentation or source
    code which explains index_block_length?
  2. On 64bit version MySQL, is data pointer 8 bytes? Could someone list
    the documentation or source code which explains the length of data
    pointer ?

2

Answers


  1. First, please note that the calculation you are looking at is for MyISAM, not for InnoDB.

    The documentation has a description of the MyISAM index file format, including references to the source code.

    "index_block_length" refers to the block size:

    keydef->block_length         2    04 00                   that is, 1024
    

    It is specified by the system configuration variable myisam-block-size, defaulting to the mentioned 1024 bytes:

    The block size to be used for MyISAM index pages.

    The index entries look like this:

    (first key value)            2    01 31                   Value is "1" (0x31).
    (first key pointer)          4    00 00 00 00             Pointer is to Record
                                                              #0000.
    (second key value)           2    01 33                   Value is "3" (0x33).
    (second key pointer)         4    00 00 00 02             Pointer is to Record
                                                              #0002.
    

    where "data_pointer_length" is the length of a "key pointer". On disk, so it has has nothing to do with using 32- or 64-bit memory pointers.

    The size is set by the myisam_data_pointer_size configuration variable. You may notice that the default size is 6 nowaways, not 4. And with "nowaydays" I mean since MySQL 5.0.6, released in 2005. Which might give you an idea why the documentation you looked at didn’t mention they are not actually talking about InnoDB.

    Login or Signup to reply.
  2. I use the following guidelines (for InnoDB):

    • A production system with less data than innodb_buffer_pool_size will (after sufficient "warm up") have everything cached in RAM. Hence there will be no disk reads for SELECTs or the "read" part of write operations.
    • The "fanout" of indexes (PRIMARY KEY, or secondary indexes) can be [crudely] estimated at 100. That is, for each node in the B+Tree, there will be about 100 child nodes (or, in case of the bottom level, rows).
    • Even for very large datasets, I assume that the necessary index node, plus the non-leaf nodes of the data, will be cached. (This is actually a corollary of the "100" mentioned above — 99% for data; 1% for non-leaf nodes.)
    • When "counting the disk hits" for UUID-indexed rows or for otherwise "random" rows, 1 disk hit per row.
    • For "counting the disk hits" for consecutive rows, 1 disk hit per 100 rows.
    • For writes that need to change indexes, the Primary and Unique index blocks need to be fetched to check uniqueness. Non-unique updates are handled later (cf "change buffering"), so count as zero disk hits.
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search