select *
from `attendance_marks`
where exists (select *
from `attendables`
where `attendance_marks`.`attendable_id` = `attendables`.`id`
and `attendable_type` = 'student'
and `attendable_id` = 258672
and `attendables`.`deleted_at` is null
)
and (`marked_at` between '2022-09-01 00:00:00' and '2022-09-30 23:59:59')
this query is taking too much time approx 7-10 seconds.
I am trying to optimize it but stuck at here.
Please help me optimize it a little bit.
For reference
number of rows in attendable = 80966
number of rows in attendance_marks = 1853696
2
Answers
I think if we use
JOINS
instead ofSub-Query
, then it will be more performant. Unfortunately, I don’t have the exact data to be able to benchmark the performance.I’m not sure if your business requirement allows changing the PK, and adding index. Incase it does then:
attendable_id
.attendables.id
is PK. Incase not, add an index to it. Or preferably make it the PK.attendable_type
have a lot of different values. Then consider adding an index there too.marked_at
, instead round to the nearest minute. In our case, we can round off2022-09-30 23:59:59
to2022-10-01 00:00:00
.and have these
Note that the datetime range works for any granularity.
(Be sure to check that I got the aliases for the correct tables.)
This formulation, with these indexes should allow the Optimizer to pick which table is more efficient to start with.