I am trying to get the category of specific students health inspection for each month. However, sometimes there are multiple entries for a month. How do I get the latest entry for the month using SQL?
Below is the SQL I am using:
SELECT mt_tagged_student.student_id,
mt_inspection.category,
mt_inspection.created_date as entry_date
FROM public.mt_tagged_student
LEFT JOIN public.mt_student On mt_tagged_student.student_id = mt_student.id
LEFT JOIN public.mt_inspection On mt_inspection.student_id = mt_student.id
where mt_inspection.created_date between '2023-10-1 00:00:01' AND '2023-10-31 23:59:59'
and mt_inspection.category = 1
ORDER BY mt_tagged_student.student_id ASC, entry_date Desc
Here’s some of the result of the SQL,
The highlighted one show that sometimes there are two entries in a month. How do I get the latest entry of that month only?
3
Answers
You can use
NOT EXISTS
as follows:OR you can use
ROW_NUMBER
as follows:A textbook example for
DISTINCT ON
:Should also perform best for just a couple entries max per month like you indicate. See:
Also, don’t use
BETWEEN
with timestamps.I converted both instances of
LEFT [OUTER] JOIN
to plain[INNER] JOIN
because the former make no sense in combination withWHERE
conditions on the right table. See:I threw in
DESC NULLS LAST
at first, because theLEFT JOIN
can introduce null values. That’s removed now, but we still don’t know whethercreated_date
itself is definedNOT NULL
. See:USING
instead ofON
is a totally optional syntax shortcut. But I warmly recommend using table aliases as demonstrated to cut down on the noise.