I’m writing a solution for a DataLemur SQL question and I almost had it right but there is one row missing.
I have written this query myself:
WITH CTE AS
(
SELECT
user_id,
transaction_date,
DENSE_RANK() OVER (PARTITION BY user_id order by transaction_date) as ranked_transact
FROM user_transactions
)
SELECT
transaction_date,
user_id,
COUNT(user_id) as purchase_count
FROM CTE
WHERE ranked_transact IN (SELECT MAX(ranked_transact) FROM CTE)
GROUP BY user_id, transaction_date
Then, when I submitted I have seen that I’m missing a row: for some reason, the WHERE clause condition didn’t work for user_id = 115.
The desired results were supposed to be:
transaction_date | user_id | purchase_count |
---|---|---|
07/11/2022 10:00:00 | 123 | 1 |
07/12/2022 10:00:00 | 115 | 1 |
07/12/2022 10:00:00 | 159 | 2 |
But my query returned:
transaction_date | user_id | purchase_count |
---|---|---|
07/11/2022 10:00:00 | 123 | 1 |
07/12/2022 10:00:00 | 159 | 2 |
I checked the solution and saw that they ordered the ranking by descending order of transaction_date and then they used WHERE transaction_date=1 . In my query, I ordered the ranking by ascending order and asked the select statement to return the max ranking value which is supposed to give me the most recent transaction row, no?
I’m probably missing something silly but can someone explain to me where lies the problem and why is user_id 115 not included when I use the count function?
Thank you!
2
Answers
The
MAX(ranked_transact)
depends on theuser_id
. In your code you only select the records where theranked_transact
is equal to the MAX()`, which is only true for the 2 records shown.DENSE_RANK would produce 1 for every recent transaction date, later used on the WHERE condition
Query,
Data Example