MySQL query to find the customers who have placed orders on consecutive days

user23860092
March 28, 2024
302 views
2 votes
3 Answers

I am trying to get the ids of the customers who have placed orders on consecutive days. The table is created as follows:

create table orders(
  orderid INT,
  orderdate date,
  customerid int
);

The values:

insert into orders (orderid, orderdate, customerid)
values(1,'2023-06-20',1),
    (2, '2023-06-21', 2),
    (3, '2023-06-22', 3),
    (4, '2023-06-22', 1),
    (5, '2023-06-23', 3),
    (6, '2023-06-22', 1),
    (7, '2023-06-26', 4),
    (8, '2023-06-27', 4),
    (9, '2023-06-29', 4),
    (10, '2023-06-29', 5),
    (11, '2023-06-30', 5),
    (12, '2023-06-28', 5),
    (13, '2023-06-25', 4),
    (14, '2023-06-24', 4),
    (15, '2023-06-30', 4);

The code that I wrote gave the output if ids with orders consecutive days but left the ids of customers who had a gap in their order despite having more number of orders before the gap actually occurred.
The code that I wrote:

with t1 as(
select customerid, orderdate,
case when lead(orderdate) over (partition by customerid order by orderdate) is null then 1
else abs(orderdate - lead(orderdate) over (partition by customerid order by orderdate)) end as gap
from orders)
select customerid, sum(gap) as consecutive
from t1
where gap>0
group by customerid
having count(*)=sum(gap) and count(*)>1;

The output:

+------------+------------------+
| customerid | consecutive_days |
+------------+------------------+
|          3 |                2 |
|          5 |                3 |
+------------+------------------+

The output I would want:

+------------+------------------+
| customerid | consecutive_days |
+------------+------------------+
|          3 |                2 |
|          4 |                4 |
|          4 |                2 |
|          5 |                3 |
+------------+------------------+

Since customer with customerid 4 has ordered from 2023-06-24 to 2023-06-27. The next order from the same customer is on 2023-06-29 and 2023-06-30 hence not continuous and should occur as a separate row.

Edit: The orders placed must be on consecutive days, irrespective of the number of orders placed on a single day.

Answers

This is a typical gaps and islands problem and here is one way of getting your desired result:

SELECT customerid, COUNT(DISTINCT orderdate) AS consecutive_days
FROM (
    SELECT *, orderdate - INTERVAL DENSE_RANK() OVER (PARTITION BY customerid ORDER BY orderdate) DAY AS grp
    FROM orders
) AS order_groups
GROUP BY customerid, grp
HAVING consecutive_days > 1;

Outputs:

customerid	consecutive_days
3	2
4	4
4	2
5	3

Here’s a db<>fiddle.

If you look at the result of the derived table you can see that by subtracting the DENSE_RANK from the orderdate we create a value we can use for grouping.

orderid	orderdate	customerid	grp
1	2023-06-20	1	2023-06-19
4	2023-06-22	1	2023-06-20
6	2023-06-22	1	2023-06-20
2	2023-06-21	2	2023-06-20
3	2023-06-22	3	2023-06-21
5	2023-06-23	3	2023-06-21
14	2023-06-24	4	2023-06-23
13	2023-06-25	4	2023-06-23
7	2023-06-26	4	2023-06-23
8	2023-06-27	4	2023-06-23
9	2023-06-29	4	2023-06-24
15	2023-06-30	4	2023-06-24
12	2023-06-28	5	2023-06-27
10	2023-06-29	5	2023-06-27
11	2023-06-30	5	2023-06-27

Note that I have used DENSE_RANK, instead of the more typical ROW_NUMBER, to handle customers placing multiple orders on the same day.

- SelVazi
- March 28, 2024 at 10:45 am
- 0 votes
0
This is an other solution to resolve this gaps and islands, it can be resolved by calculating the difference between row numbers (DENSE_RANK is used because various customers may place orders on the same day) , which assigns a unique identifier to each sequence of consecutive orders for a customer :
```
WITH cte AS (
  SELECT
    *, DENSE_RANK() over(ORDER BY orderdate) - DENSE_RANK() over (PARTITION BY customerid ORDER BY orderdate) as grp
  FROM orders
)
SELECT customerid, COUNT(DISTINCT orderdate) AS consecutive_days 
FROM cte
GROUP BY customerid, grp
HAVING consecutive_days > 1;
```
Results :
```
customerid  consecutive_days
3           2
4           4
4           2
5           3
```
Demo here
Login or Signup to reply.

- ZsoltSzilagyi
- March 28, 2024 at 10:52 am
- 0 votes
0
It really depends on yxour table size.
If you only have a couple hundred orders, and O=N² is viable, go with something as simple as
```
SELECT DISTINCT o1.customerid
FROM orders o1
JOIN orders o2 ON o1.customerid = o2.customerid
AND DATEDIFF(o1.orderdate, o2.orderdate) = 1;
```
For performance, put a single index over both customerid and orderdate.

The best way, though somewhat complex, are the MySQL LEAD and LAG functions.
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.