EDIT
I misread my initial error and blamed the INDEX not being used on the wrong columns.
I was able to recreate the issue that I saw and the solution that ysth suggested worked.
Below are the create tables statements, inserts to the tables, and two queries – one that has the error and another with the solution which does not have it.
# Make tables and indices
DROP TABLE a;
DROP TABLE b;
create table a
(
DT DATE,
USER INT,
COMMENT_SENTIMENT INT,
PRIMARY KEY (USER, DT));
CREATE INDEX a_DT_USER_IDX ON a (DT,USER);
create table b
(
id int auto_increment primary key,
DT DATETIME(6),
USER mediumtext,
COMMENT_SENTIMENT INT);
CREATE INDEX b_DT_USER_IDX ON b (DT);
CREATE UNIQUE INDEX b_DT_USER ON b (USER(16), DT);
# Insert some dummy data
INSERT INTO a VALUES('2023-01-01', 5, 4);
INSERT INTO b VALUES(NULL, '2023-01-01 00:00:00', 5, 4);
# Explain that shows the issue I was seeing.
EXPLAIN
SELECT *
FROM a
JOIN b
ON a.DT = b.DT
AND a.USER = b.USER;
# Out
# 1,SIMPLE,a,,ALL,"PRIMARY,a_DT_USER_IDX",,,,1,100,
# 1,SIMPLE,b,,ref,"b_DT_USER,b_DT_USER_IDX",b_DT_USER_IDX,9,a.DT,1,100,Using index condition; Using where
[2023-01-24 18:00:14] [HY000][1739] Cannot use ref access on index 'b_DT_USER' due to type or collation conversion on field 'USER'
[2023-01-24 18:00:14] [HY000][1003] /* select#1 */ select `a`.`DT` AS `DT`, `a`.`USER` AS `USER`,`a`.`COMMENT_SENTIMENT` AS `COMMENT_SENTIMENT`,`b`.`id` AS `id`,`b`.`DT` AS `DT`,`b`.`USER` AS `USER`,`b`.`COMMENT_SENTIMENT` AS `COMMENT_SENTIMENT` from `a` join `b` where ((`a`.`DT` = `b`.`DT`) and (`a`.`USER` = `b`.`USER`))
# Explain with the fix ysth suggested
EXPLAIN
SELECT *
FROM a
JOIN b
ON a.DT = b.DT
AND a.USER = CAST(b.USER AS DECIMAL );
# 1,SIMPLE,a,,ALL,"PRIMARY,a_DT_USER_IDX",,,,1,100,
# 1,SIMPLE,b,,ref,b_DT_USER_IDX,b_DT_USER_IDX,9,a.DT,1,100,Using index condition; Using where
# [2023-01-24 18:04:24] [HY000][1003] /* select#1 */ select `a`.`DT` AS `DT`,`a`.`USER` AS `USER`,`a`.`COMMENT_SENTIMENT` AS `COMMENT_SENTIMENT`,`b`.`id` AS `id`,`b`.`DT` AS `DT`,`b`.`USER` AS `USER`,`b`.`COMMENT_SENTIMENT` AS `COMMENT_SENTIMENT` from `a` join `b` where ((`a`.`DT` = `b`.`DT`) and (`a`.`USER` = cast(`b`.`USER` as decimal(10,0))))
# [2023-01-24 18:04:24] 2 rows retrieved starting from 1 in 359 ms (execution: 250 ms, fetching: 109 ms)
__
The below information is incorrect. Please use the edit to see the issue I was having and it’s solution.
I have three tables a
, b
, and c
in my MySQL 5.7 database. SHOW CREATE
statements for each table are:
CREATE TABLE `a` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`DT` date DEFAULT NULL,
`USER` int(11) DEFAULT NULL,
`COMMENT_SENTIMENT` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `a_DT_USER_IDX` (`DT`,`USER`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `b` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`DT` datetime DEFAULT NULL,
`USER` int(11) DEFAULT NULL,
`COMMENT_SENTIMENT` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `b_DT_USER_IDX` (`DT`,`USER`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `c` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`DT` date DEFAULT NULL,
`USER` int(11) DEFAULT NULL,
`COMMENT_SENTIMENT` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `b_DT_USER_IDX` (`DT`,`USER`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Table a
has a DATE
column a.DT
, table b
has a DATETIME
column b.DT
, and table c
has a DATE
column c.DT
.
All of these DT columns are indexed.
As a caveat, while b.DT
is a DATETIME
, all of the ‘time’ portions in it are 00:00:00
and they always will be. It probably should be a DATE
, but I cannot change it.
I want to join table a
and table b
on their DT columns, but explain tells me that their indices are not used:
Cannot use ref access on index 'b.DT_datetime_index' due to type or collation conversion on field 'DT'
When I join table a
and b
on a.DT
and b.DT
SELECT *
FROM a
JOIN b
ON a.DT = b.DT;
The result is much slower than when I do the same with a and c
SELECT *
FROM a
JOIN c
ON a.DT = c.DT;
Is there a way to use the indices in join from the first query on a.DT = b.DT
, specifically without altering the tables? I’m not sure if b.DT
having only 00:00:00
for the time portion could be relevant in a solution.
The end goal is a faster select using this join.
Thank you!
— What I’ve done section —
I compared the joins between a.DT = b.DT
and a.DT = c.DT
, and saw the time difference.
I also tried wrapping b
‘s DT
column with DATE(b.DT)
, but explain gave the same issue, which is pretty expected.
3
Answers
MySQL won’t use an index to join
DATE
andDATETIME
columns.You can create a virtual column with the corresponding
DATE
and use that.Assuming you want to read a and join b rows, you can just do
If the other way around, then
No need for a virtual column.
Virtually any function call is "not sargable " That is,
CAST(b.USER AS DECIMAL )
prevents the use of an index.Do not mix strings and ints in comparisons. The string will be converted to numeric. If the string is a literal, such as
'123'
then the Optimizer is smart enough to do that once. If it is a column name, it must check every row.Tip: If you are likely to test for one
user
and arange
of dates, then this works better than the opposite order.(You may need an index starting with dt for other queries.)