Currently, I have a query that looks like this:
SELECT send_id, count(id) FROM `table1`
WHERE date BETWEEN '2024-09-01%' AND '2024-09-30%'
GROUP BY send_id;
This returns a result that looks like this:
| send_id | count(id) |
-------------------------------
| 00123 | 32 |
| 00234 | 12 |
| 00567 | 100 |
| 00890 | 07 |
-------------------------------
I want to add the date and get the count for each send_id for each day.
I wrote this query:
SELECT DATE(date) as 'date', send_id, count(id) FROM `table1`
WHERE date BETWEEN '2024-09-01%' AND '2024-09-30%'
GROUP BY date, send_id;
I use DATE(date) because the column date is a datetime. I don’t need the time, just the date.
I am getting results that look like this:
| date | send_id | count(id) |
---------------------------------------------
| 2024-09-01 | 00123 | 03 |
| 2024-09-01 | 00123 | 10 |
| 2024-09-01 | 00567 | 20 |
| 2024-09-01 | 00567 | 05 |
---------------------------------------------
The results I was looking for should be like this:
| date | send_id | count(id) |
---------------------------------------------
| 2024-09-01 | 00123 | 13 |
| 2024-09-01 | 00567 | 25 |
---------------------------------------------
Because send_id had a total id count of 13 on 2024-09-01, but it’s breaking it off into 2 rows.
The id is the auto_incremented field.
I think it’s because of the datetime, but I’m not sure.
How can I fix the query so that it gets the id count of each send_id for each day?
5
Answers
Just adding comment as answer. Check out your
GROUP BY
, where you need to also group by date(date).you can use try_cast to convert the data into date_type
When you use
GROUP BY date
it’s grouping by thedate
column in the table, not thedate
alias that you assigned toDATE(date)
.You can avoid this problem by using a different name for the alias:
I’d recommend a derived table, to make the date to DATE(date) conversion in advance, then
GROUP BY
its result. (Very handy trick to avoid repeating complex group by expressions.)Given you have datetime columns, this
WHERE
clause will perform so much better:Thanks to cultural/internationalization issues, treating dates as strings is much slower and more error-prone than we’d like to believe. It’s something to avoid.
Using the
LIKE '2024-09-01%'
expression forces the database to implicitly do this slower conversion from datetime to a string type for every row in the table, even rows that won’t be used. And making this conversion means the converted values no longer match values used with any index you might have had, making the index worthless for this query.Whenever possible you want to write your conditional expressions to avoid mutating the values stored in the table.
In the case of dates, this commonly means writing two expressions forming a half-open range with an exclusive upper bound for the day after the close of the range.
Put it all together like this:
As a bonus, for month ranges like this you no longer have to worry about finding the correct last day of the month. Instead you will always use the first day of the month following.