skip to Main Content

I am using phpMyAdmin that uses MySQL version 5.7.

I have a database table called condensed which contains millions of unique emails with multiple "send dates", among other columns. Some of the emails will have a NULL send date.

Focusing on the just the email and send_date columns here.

Here is an example of the table:

| email  | a_last_sent | b_last_sent | c_last_sent | d_last_sent | ..up to 14 dates
----------------------------------------------------------------------------------
| email1 | 2024-06-12  | 2024-05-25  |    NULL     | 2024-06-06  |
----------------------------------------------------------------------------------
| email2 | 2024-06-01  | 2024-06-16  |  2024-06-05 | 2024-06-19  |
----------------------------------------------------------------------------------
| email3 |   NULL      |   NULL      |  2024-05-12 | 2024-06-10  |
----------------------------------------------------------------------------------
| email4 |   NULL      | 2024-06-13  |    NULL     | 2024-05-11  |
----------------------------------------------------------------------------------
| email5 | 2024-06-09  | 2024-05-01  |    NULL     |    NULL     |
----------------------------------------------------------------------------------  

I need to exclude anything that has more than 3 sent dates within the month of June.

So if an email has a sent date in 3 or columns within the month of June, do not return those records.

Using my example above, the results should looks something like this:

| email  | a_last_sent | b_last_sent | c_last_sent | d_last_sent | ..up to 14 dates
----------------------------------------------------------------------------------
| email1 | 2024-06-12  | 2024-05-25  |    NULL     | 2024-06-06  |
----------------------------------------------------------------------------------
| email3 |   NULL      |   NULL      |  2024-05-12 | 2024-06-10  |
----------------------------------------------------------------------------------
| email4 |   NULL      | 2024-06-13  |    NULL     | 2024-05-11  |
----------------------------------------------------------------------------------
| email5 | 2024-06-09  | 2024-05-01  |    NULL     |    NULL     |
----------------------------------------------------------------------------------  

The results above excluded email2 because it had 4 dates within the month of June.

email1 meets the criteria because it only has 2 dates within the month of June and 1 in May.

email3, email4, and email5 meet the criteria because they only have 1 date within the month of June and 1 in May.

I can write the below query:

SELECT 
  * FROM 
`condensed` 
WHERE 
(
  `a_last_sent` NOT BETWEEN '2024-06-01' AND '2024-06-30'
  OR
  `b_last_sent` NOT BETWEEN '2024-06-01' AND '2024-06-30'
  OR
  `c_last_sent` NOT BETWEEN '2024-06-01' AND '2024-06-30' 
  // remaining date columns     
)

The query above excludes anything within the date range.

I’m just not sure how to incorporate the count to exclude anything that has over 3 dates within the specified date range (which in this case is June).

Please help and thank you.

2

Answers


  1. Use the fact that TRUE counts as 1 when doing maths with it in MySQL.

    SELECT * FROM `condensed` 
    WHERE 
        (`a_last_sent` BETWEEN '2024-06-01' AND '2024-06-30')
        + (`b_last_sent` BETWEEN '2024-06-01' AND '2024-06-30')
        + (`c_last_sent` BETWEEN '2024-06-01' AND '2024-06-30') 
        // remaining date columns ... BTW bad table structure   
        <= 3; 
    
    Login or Signup to reply.
  2. The values contains NULLs – so we cannot concat or sum the expressions with the most part of the functions/operators.

    I’d recommend you to concat the dates using CONCAT_WS() – this function skips NULLs. Then count the amount of according substrings.

    For definite month use something like

    SELECT LENGTH(dateslist) - LENGTH(REPLACE(dateslist, '2024-06-', '2024-06')) AS amount, ...
    FROM ( SELECT CONCAT_WS(',', a_last_sent, b_last_sent, ..., x_last_sent) AS dateslist, ...
           ...
           )
    ...
    

    For custom range use UNION, nested REPLACEs (v.5.7 does not support regular expression replace) or register and use according user-defined function.

    Of course you may also use multiple COALESCE().

    Also you may unpivot your data with multiple UNIONs then count by the common way.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search