skip to Main Content

Table 1

+--------+------------+-------+
|  URI   |    Date    | Name  |
+--------+------------+-------+
| Fred4  | 2023-04-05 | Fred  |
| Fred3  | 2023-04-01 | Fred  |
| Fred2  | 2023-03-15 | Fred  |
| Fred1  | 2023-03-06 | Fred  |
| Dave3  | 2023-05-22 | Dave  |
| Dave2  | 2023-05-11 | Dave  |
| Dave1  | 2023-05-03 | Dave  |
| Simon6 | 2023-05-20 | Simon |
| Simon5 | 2023-05-11 | Simon |
| Simon4 | 2023-04-21 | Simon |
| Simon3 | 2023-04-19 | Simon |
| Simon2 | 2023-04-12 | Simon |
| Simon1 | 2023-03-25 | Simon |
+--------+------------+-------+

Table 2

+--------+------------+--------+
|  URI   |    Date    |  Item  |
+--------+------------+--------+
| Fred4  | 2023-04-05 | Top    |
| Fred3  | 2023-04-01 | Shorts |
| Fred2  | 2023-03-15 | Band   |
| Fred1  | 2023-03-06 | Top    |
| Dave3  | 2023-05-22 | Shorts |
| Dave2  | 2023-05-11 | Shoes  |
| Dave1  | 2023-05-03 | Top    |
| Simon6 | 2023-05-20 | Shorts |
| Simon5 | 2023-05-11 | Band   |
| Simon4 | 2023-04-21 | Shorts |
| Simon3 | 2023-04-19 | Top    |
| Simon2 | 2023-04-12 | Shoes  |
| Simon1 | 2023-03-25 | Shoes  |
+--------+------------+--------+

For each URI I am trying to pull in each previous item so that Item1 = last bought, item2 = 2nd last bought etc:

+--------+------------+--------+--------+-------+-------+-------+
|  URI   |    Date    | Item1  | Item2  | Item3 | Item4 | Item5 |
+--------+------------+--------+--------+-------+-------+-------+
| Fred4  | 2023-04-05 | Shorts | Band   | Top   |       |       |
| Fred3  | 2023-04-01 | Band   | Top    |       |       |       |
| Fred2  | 2023-03-15 | Top    |        |       |       |       |
| Fred1  | 2023-03-06 | NULL   |        |       |       |       |
| Dave3  | 2023-05-22 | Shoes  | Top    |       |       |       |
| Dave2  | 2023-05-11 | Top    |        |       |       |       |
| Dave1  | 2023-05-03 | NULL   |        |       |       |       |
| Simon6 | 2023-05-20 | Band   | Shorts | Top   | Shoes | Shoes |
| Simon5 | 2023-05-11 | Shorts | Top    | Shoes | Shoes |       |
| Simon4 | 2023-04-21 | Top    | Shoes  | Shoes |       |       |
| Simon3 | 2023-04-19 | Shoes  | Shoes  |       |       |       |
| Simon2 | 2023-04-12 | Shoes  |        |       |       |       |
| Simon1 | 2023-03-25 | NULL   |        |       |       |       |
+--------+------------+--------+--------+-------+-------+-------+

Currently I can get the most recent last item but I am struggling with the 2nd, 3rd etc. despite trying a few different things.

with all_data as (
    SELECT t1.URI, t1.date, t1.name
    ,t2.Item as Item1, t2.Item as Item2, t2.Item as Item3, t2.Item as Item4, t2.Item as Item5
    FROM Table1 t1
    LEFT OUTER JOIN Table2 t2 on t1.name = t2.name and t1.date > t2.date
)
SELECT URI, date, name, Item1, Item2, Item3, Item4, Item5
from all_data
group by URI, date, name
order by URI, date, name
; 

Any help would be great 🙂 I’m pretty sure the query needs to be quite a lot bigger unless someone knows a few tricks.

Using MySQL and HeidiSQL client.

2

Answers


  1. In MySQL, one way you can accomplish this is by utilizing window functions and conditional aggregation. This will allow you to get the nth item per each URI, if they exist.

    The logic in SQL would look something like this:

    WITH all_data AS (
        SELECT t1.URI, t1.date, t1.name,
            t2.Item,
            ROW_NUMBER() OVER (PARTITION BY t1.URI ORDER BY t2.Date DESC) as item_rank
        FROM Table1 t1
        LEFT JOIN Table2 t2 ON t1.URI = t2.URI AND t1.date >= t2.date
    )
    SELECT 
        URI, 
        MAX(date) as date, 
        MAX(name) as name, 
        MAX(CASE WHEN item_rank = 1 THEN Item END) as Item1,
        MAX(CASE WHEN item_rank = 2 THEN Item END) as Item2,
        MAX(CASE WHEN item_rank = 3 THEN Item END) as Item3,
        MAX(CASE WHEN item_rank = 4 THEN Item END) as Item4,
        MAX(CASE WHEN item_rank = 5 THEN Item END) as Item5
    FROM all_data
    GROUP BY URI
    ORDER BY URI;
    

    This will first assign a rank to each row within each URI based on the Date in descending order in the all_data subquery. The main query then uses conditional aggregation to select the Item for each URI based on the rank of each row.

    Please, note that window functions like ROW_NUMBER() are only available in MySQL 8.0 and later. If you’re using a version prior to 8.0, this approach won’t work and you’ll need to consider a different solution.

    Login or Signup to reply.
  2. with t as (
      select  t1.uri, t1.date, t1.name, t2.Item,
               row_number() over (partition by t1.name order by t2.Date desc ) rn
      from Table1 t1 join Table2 t2 on t1.URI = t2.URI and t1.date >= t2.date),
    joined as (
      select a.uri, a.date, a.name, b.item, b.rn - a.rn rn 
      from t a left join t b on a.name = b.name and a.rn < b.rn )
    select uri, date, 
           max(case rn when 1 then item end) item1,
           max(case rn when 2 then item end) item2,
           max(case rn when 3 then item end) item3,
           max(case rn when 4 then item end) item4,
           max(case rn when 5 then item end) item5
    from joined group by uri, date, name order by name, date desc
    

    dbfiddle demo

    Steps:

    • number the rows in the subquery,
    • perform a self-join of this subquery based on name and a.rn < b.rn, which multiplies entries including desired empty rows (Simon1, Fred1),
    • subtract row numbers a.rn – b.rn, this gives correct value for pivot to operate,
    • make classic pivot,
    • sort rows by undisplayed column name and date desc, if you want it this way.

    You win an award in the category of strange pivots 😉

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search