I am using phpMyAdmin that uses MySQL version 5.7.
I have a database table called condensed
which contains millions of unique emails with multiple "send dates", among other columns. Some of the emails will have a NULL send date.
Focusing on the just the email
and send_date
columns here.
Here is an example of the table:
| email | a_last_sent | b_last_sent | c_last_sent | d_last_sent | ..up to 14 dates
----------------------------------------------------------------------------------
| email1 | 2024-06-12 | 2024-05-25 | NULL | 2024-06-06 |
----------------------------------------------------------------------------------
| email2 | 2024-06-01 | 2024-06-16 | 2024-06-05 | 2024-06-19 |
----------------------------------------------------------------------------------
| email3 | NULL | NULL | 2024-05-12 | 2024-06-10 |
----------------------------------------------------------------------------------
| email4 | NULL | 2024-06-13 | NULL | 2024-05-11 |
----------------------------------------------------------------------------------
| email5 | 2024-06-09 | 2024-05-01 | NULL | NULL |
----------------------------------------------------------------------------------
I need to exclude anything that has more than 3 sent dates within the month of June.
So if an email has a sent date in 3 or columns within the month of June, do not return those records.
Using my example above, the results should looks something like this:
| email | a_last_sent | b_last_sent | c_last_sent | d_last_sent | ..up to 14 dates
----------------------------------------------------------------------------------
| email1 | 2024-06-12 | 2024-05-25 | NULL | 2024-06-06 |
----------------------------------------------------------------------------------
| email3 | NULL | NULL | 2024-05-12 | 2024-06-10 |
----------------------------------------------------------------------------------
| email4 | NULL | 2024-06-13 | NULL | 2024-05-11 |
----------------------------------------------------------------------------------
| email5 | 2024-06-09 | 2024-05-01 | NULL | NULL |
----------------------------------------------------------------------------------
The results above excluded email2
because it had 4 dates within the month of June.
email1
meets the criteria because it only has 2 dates within the month of June and 1 in May.
email3
, email4
, and email5
meet the criteria because they only have 1 date within the month of June and 1 in May.
I can write the below query:
SELECT
* FROM
`condensed`
WHERE
(
`a_last_sent` NOT BETWEEN '2024-06-01' AND '2024-06-30'
OR
`b_last_sent` NOT BETWEEN '2024-06-01' AND '2024-06-30'
OR
`c_last_sent` NOT BETWEEN '2024-06-01' AND '2024-06-30'
// remaining date columns
)
The query above excludes anything within the date range.
I’m just not sure how to incorporate the count to exclude anything that has over 3 dates within the specified date range (which in this case is June).
Please help and thank you.
2
Answers
Use the fact that TRUE counts as 1 when doing maths with it in MySQL.
The values contains NULLs – so we cannot concat or sum the expressions with the most part of the functions/operators.
I’d recommend you to concat the dates using CONCAT_WS() – this function skips NULLs. Then count the amount of according substrings.
For definite month use something like
For custom range use UNION, nested REPLACEs (v.5.7 does not support regular expression replace) or register and use according user-defined function.
Of course you may also use multiple COALESCE().
Also you may unpivot your data with multiple UNIONs then count by the common way.