Mysql - IS NOT NULL date in sql or compare with raw data

aminesmaili
July 24, 2024
217 views
0 votes
2 Answers

I want to ask a question,
imagine i have this query:

SELECT COUNT(DISTINCT (`patient_id`)) AS `num`
FROM `visits` AS `Visit`
         INNER JOIN `patients` AS `patient` ON `Visit`.`patient_id` = `patient`.`id`
WHERE `Visit`.`doctor_id` = 21293
  AND `Visit`.`office_id` = 14712
  AND `Visit`.`started_at` IS NOT NULL
LIMIT 1;

now the query is slow and the started_at is indexed.

if i do this instead :

SELECT COUNT(DISTINCT (`patient_id`)) AS `num`
FROM `visits` AS `Visit`
         INNER JOIN `patients` AS `patient` ON `Visit`.`patient_id` = `patient`.`id`
WHERE `Visit`.`doctor_id` = 21293
  AND `Visit`.`office_id` = 14712
  AND `Visit`.`started_at` > '2000-01-01'
LIMIT 1;

and use comparison for specifying the nullability of the column, it is getting faster. now i wonder if this is the case for all of my queries and i can use that tactic for all of them. is it faster in all cases?

I didnt try anything im just asking if they know it for a fact

Answers

- AjazAbdulAziz
- July 21, 2024 at 3:40 pm
- 0 votes
0
Data volume plays a major part in discussing about the efficiency of both these queries. The second query seems more efficient due to its specific date filter. But, the actual performance difference can be due to other factors as well, like indexing, data, and database optimization strategies.

Login or Signup to reply.

- LajosArpad
- July 24, 2024 at 3:10 pm
- 0 votes
0
It largely depends on your table configuration and your index. If you have a simple index on started_at, then it creates a data-structure under the hood that optimizes your searches upon the indexed field-set. Speed largely depends on how that data-structure is being implemented under the hood and how it handles your scenarios. See this article as an example reference: https://builtin.com/data-science/b-tree-index

Now, you have a first search that checks for not null. We know that null is the unknown and therefore the RDBMS will not be able to properly compare any timestamp with null, see this:
```
mysql> select now() < null;
+--------------+
| now() < null |
+--------------+
|         NULL |
+--------------+
1 row in set (0.00 sec)

mysql> select now() > null;
+--------------+
| now() > null |
+--------------+
|         NULL |
+--------------+
1 row in set (0.00 sec)

mysql> select now() = null;
+--------------+
| now() = null |
+--------------+
|         NULL |
+--------------+
1 row in set (0.00 sec)
```
so if your indexed graph’s nodes are being compared against null, then each of the items will likely be compared to null and that will amount to as many comparisons as many nodes you have whereas your second query will quickly evaluate entire trees of nodes by comparing the root to your search criteria, greatly reducing the number of comparisons.

However, if you define an index on whether a field is null (see this for reference:https://dev.mysql.com/doc/refman/8.4/en/is-null-optimization.html), then the is null or is not null comparisons will be quick.

If you want to make them both quick, then you can create a multiple-column index, the first index criteria checking whether the field is null and the second index criteria would depend upon its value. Then you could filter by
```
`Visit`.`started_at` IS NOT NULL AND `Visit`.`started_at` > '2000-01-01'
```
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Mysql – IS NOT NULL date in sql or compare with raw data

Answers