I’m having great difficulty writing this query and cannot find any answers online which could be applied to my problem.
I have a couple of tables which looks similar to the below with. Each purchase date corresponds with an item purchased.
Cust_ID | Purchase_Date |
---|---|
123 | 08/01/2022 |
123 | 08/20/2022 |
123 | 09/05/2022 |
123 | 10/08/2022 |
123 | 12/25/2022 |
123 | 01/26/2023 |
The result I am looking for should contain the customers ID, a range of the purchases, the number of consecutive months they had made a purchase (regardless of which day they purchased), and a count of how many purchases they had made in the time frame. The result should look something like the below for my example.
Cust_ID | Min Purchase Date | Max Purchase Date | Consecutive Months | No. Items Purchased |
---|---|---|---|---|
123 | 08/01/2022 | 10/08/2022 | 3 | 4 |
123 | 12/25/2022 | 01/26/2023 | 2 | 2 |
I have tried using CTEs with querys similar to
WITH CTE as
(
SELECT
PaymentDate PD,
CustomerID CustID,
DATEADD(m, -ROW_NUMBER() OVER (PARTITION BY c.CustomerID ORDER BY
DATEPART(m,PaymentDate)), PaymentDate) as TempCol1,
FROM customers as c
LEFT JOIN payments as p on c.customerid = p.customerid
GROUP BY c.CustomerID, p.PaymentDate
)
SELECT
CustID,
MIN(PD) AS MinPaymentDate,
MAX(PD) AS MaxPaymentDate,
COUNT(*) as ConsecutiveMonths,
FROM CTE
GROUP BY CustID, TempCol1
However, the above failed to properly count consecutive months. When the payment dates matched a month apart (e.g. 1/1/22 – 2/1/22), the query properly counts the consecutive months. However, if the dates do not match from month to month (e.g. 1/5/22 – 2/15/22), the count breaks.
Any guidance/help would be much appreciated!
2
Answers
You need to use the
dense_rank
function instead of therow_number
, this will give the same rank for the same months and avoid breaking the grouping column. Also, you need to aggregate for ‘year-month’ of the grouping date column.See demo on MySQL
You tagged your question with MySQL, while it seems that you posted an SQL Server query syntax, for SQL Server just use
dateadd(month, -dense_rank() over (partition by Cust_ID order by year(Purchase_Date), month(Purchase_Date)), Purchase_Date)
.See demo on SQL Server.
This is just a small enhancement on the answer already given by ahmed. If your date range for this query is more than a year, then
year(M.Purchase_Date) + month(M.Purchase_Date)
will be2024
for both2022-02-01
and2023-01-01
as YEAR() and MONTH() both return integer values. This will return incorrect count of consecutive months. You can change this to use CONCAT() or FORMAT(). Also, theCOUNT(*)
forItemsPurchased
should be counting the right hand side of the join, as it is a LEFT JOIN.Here’s a db<>fiddle