I need an SQL query for the next question.
For example I have a table:
| id | element | dt |
| -------- | -------- | -------- |
| 1 | a | 22/04/22 |
| 2 | a | 22/04/22 |
| 1 | b | 27/04/22 |
| 1 | a | 23/04/22 |
| 3 | b | 22/04/22 |
| 1 | a | 22/04/22 |
| 1 | a | 22/04/22 |
| 3 | b | 23/04/22 |
| 3 | b | 25/04/22 |
| 1 | a | 27/04/22 |
| 1 | c | 26/04/22 |
| 1 | d | 26/04/22 |
| 1 | g | 25/04/22 |
| 1 | b | 27/04/22 |
I want to find the percentage of similar items that the user (id) sees in one week (but not counting on the same day). For example, for this table, the answer is:
| id | percentage |
| -------- | --------- |
| 1 | 0.3 |
| 2 | 0.0 |
| 3 | 1 |
for id 1 answer is 0.3, because if we are taking a window for one week (7 days) he has seen an element "a" 3 times at 22/04/22, 23/04/22 and 27/04/22. He seen "a" at 22/04/22 a few times, but we don’t counting it. At he seen 10 times at total. So answer is 3/10 = 0.3
I tries smth like this, but it doesn’t solve my problem exactly.
SELECT
dt,
id,
COUNT(DISTINCT element) AS total,
(
SELECT COUNT(DISTINCT element)
FROM table newt
WHERE newt.dt = dt
) AS same_day_elements,
COUNT(DISTINCT element) / (SELECT COUNT(DISTINCT element) FROM table tempt WHERE date = tempt.date) * 100 AS percentage
FROM
table t
GROUP BY
dt,
id
2
Answers
You should think in terms of first collapsing your data set and then performing your calculations on that reduced data set. The problem becomes a whole lot easier. So, first count all the id-element combinations, then embed it in a calculation:
This is based on Chris’s answer, with the subquery containing counts for distinct date and all rows for each id, element group:
Output:
In the comments above you extended your question with:
It is not completely clear what you want for this, but assuming you want to group by week, you could add week number (WEEK(date[,mode])) to both the inner and outer groupings:
Here’s a db<>fiddle