index is used with SELECT COUNT(1) but not with SELECT * - Mysql

neubert
April 21, 2023
340 views
0 votes
2 Answers

Here’s my query:

SELECT COUNT(1)
FROM nhd
WHERE gnis_id IN (01372336);

The table definition has an INDEX gnis_id (gnis_id) on it and, indeed, that index is used with this query. BUT when I replace the COUNT(1) with * the index is not used, at least according to EXPLAIN. Why is that? Why would changing the columns I’m getting back change how MySQL is deciding which INDEX to use for the WHERE part of the query?

Here’s the output of the EXPLAIN with *:

id: 1
select_type: SIMPLE
table: nhd
partitions: NULL
type: ALL
possible_keys: gnis_id
key: NULL
key_len: NULL
ref: NULL
rows: 2752840
filtered: 10.00
Extra: Using where

Here’s the output of SHOW INDEX FROM nhd:

https://pastebin.com/KXpqBeJm

I’m running MySQL 8.0.33

Answers

- blabla_bingo
- April 21, 2023 at 6:31 am
- 0 votes
0
From result of your SHOW INDEX statement, we can see the info regarding the table condition gnis_id:
```
Table: nhd
Non_unique: 1
Key_name: gnis_id
Seq_in_index: 1
Column_name: gnis_id
Collation: A
Cardinality: 16206
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
Index_comment:
Visible: YES
Expression: NULL
```
As the cardinality suggests, the selectivity for the index i.e the cardinality/table_row_count 16206 / 2752840 = 0.0059 is quite low. And by using select *, which has items beyond the stored values of the index, should MySQL decide to use the index, it would have to use the clustered key at the end of each index row to refer to the main table.
With the aboved mentioned two issues:low selectivity and has to use clustered index to refer to main table for missing information, MySQL’s optimiser believes it’d better use a FULL TABLE scan (which is based upon the clustered index) instead, as it find the latter one worse in the trade-off between the (FULL TABLE scan) and (extra reference work plus low selectivity).
Here is a couple of things you can try:

First, use select gnis_id instead of select * ,this will have MySQL use the index as the index covers the value of gnis_id itself, which means no extra reference work is necessary (the same applies to select count(1)). You will find the access type to be ref in the excution plan, which makes the query response time significantly (hopefully) better than the access type ALL which is caused by select *.

Secondly, use the select * clause but at the end of the from clause add force index(gnis_id) to force MySQL to use the index.
```
SELECT *
FROM nhd force index(gnis_id)
WHERE gnis_id IN (01372336);
```
Find how long it takes to execute it and compare how long it takes to execute the one without forcing the index. You will see why MySQL deems it unnecessary to use the index.
Login or Signup to reply.

- RickJames
- April 21, 2023 at 7:00 am
- 0 votes
0
A guess. Is this the declaration of the column?
```
gnis_id VARCHAR(8)
```
Meanwhile, you are checking those strings against a number:
```
WHERE gnis_id IN (01372336)
```
That means it has to convert each gnis_id to a number before testing.

Solution:
```
WHERE gnis_id IN ("01372336")
```
With that change, it will look at only 96 rows, not 2,758,717 rows. (Actually, it will be 97 rows — the extra one is to discover that it is finished.)

Discussion

SELECT COUNT(1) can be done entirely in the INDEX's BTree. That is the index is "covering"; EXPLAIN will say "Using index". And, because the index is ordered by gnis_id, the rows it needs to look at are consecutive once the datatypes match.

SELECT * needs to fetch all the columns, which are found only in the data’s BTree.

The clustered index on id does not need to be involved.
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

index is used with SELECT COUNT(1) but not with SELECT * – Mysql

Answers