skip to Main Content

This is a slightly more complicated reporting problem than I usually have. I want to find how many times customers have repurchased individual product listings. (Say, some products have really high sales, but we want to know how much are from first-time purchases and how much are repurchases.)

items_purchased:
+----------+------------+----------+
| buyer_id | listing_id | quantity |
+----------+------------+----------+
|     1234 |       5678 |        1 |
|     1234 |       3456 |        1 |
|     9012 |       3456 |        2 |
|     7901 |       5678 |        1 |
|     1234 |       5678 |        1 |
+----------+------------+----------+

From that data, I can see listing 5678 has been purchased 3 times but repurchased once. Listing 3456 has been purchased twice but never repurchased. I want to build a report like:

+------------+-----------+-------------+-----------------+
| listing_id | purchases | repurchases | repurchase_rate |
+------------+-----------+-------------+-----------------+
|       5678 |      1364 |         152 | 11%             |
|       3456 |       892 |         289 | 32%             |
|       3284 |        51 |          30 | 59%             |
|       8763 |         8 |           0 | 0%              |
+------------+-----------+-------------+-----------------+

I think I need a DISTINCT or GROUP BY somewhere, definitely a SUM I’m guessing, but I’m stumped how to do it. (The "repurchase_rate" I know how to do.)

4

Answers


  1. Off the top of my head, you could build a CTE that summarizes the count of purchases of a particular listing_id by buyer_id, subtracting 1 from each to include only repurchases:

    WITH repurchases AS
    (
        SELECT
            listing_id,
            SUM(repurchases) AS repurchases
        FROM
        (
            SELECT
                listing_id,
                buyer_id,
                COUNT(1) - 1 AS repurchases
            FROM
                items_purchased
            GROUP BY
                listing_id,
                buyer_id
        ) repurchases_by_buyer
        GROUP BY
            listing_id
    )
    SELECT
        items_purchased.listing_id,
        COUNT(1) AS purchases,
        (SELECT repurchases FROM repurchases WHERE repurchases.listing_id = items_purchased.listing_id) AS repurchases
    FROM
        items_purchased
    GROUP BY
        items_purchased.listing_id
    

    DB Fiddle

    This is probably not the most computationally efficient way to do this considering the somewhat gratuitous use of subqueries, but I’m confident there’s room for it to be optimized further to accommodate scaled usage.

    Login or Signup to reply.
  2. SELECT
      p.listing_id,
      COUNT(p.listing_id) AS purchases,
      COUNT(r.listing_id) AS repurchases,
      ROUND((COUNT(r.listing_id) / COUNT(p.listing_id)) * 100, 2) AS repurchase_rate
    FROM
      (SELECT listing_id, buyer_id, MIN(quantity) AS first_purchase
       FROM items_purchased
       GROUP BY listing_id, buyer_id) AS first_purchases
    LEFT JOIN items_purchased AS p
      ON first_purchases.listing_id = p.listing_id
    LEFT JOIN items_purchased AS r
      ON first_purchases.listing_id = r.listing_id
      AND first_purchases.buyer_id = r.buyer_id
      AND r.quantity > first_purchases.first_purchase
    GROUP BY p.listing_id;
    
    

    This query does the following:

    • Subquery first_purchases identifies the first purchase for each buyer_id and listing_id
    • The main query then LEFT JOINs the items_purchased table twice: once to count all purchases (p) and once to count repurchases (r)
    • The repurchases count is filtered to only include purchases where the quantity is greater than the first_purchase quantity
    • Finally, it calculates the repurchase_rate and formats the output

    Please note that the above query assumes that a repurchase is defined as any subsequent purchase where the quantity is greater than the first purchase quantity. Adjust the logic as needed for your specific definition of a repurchase.

    Login or Signup to reply.
  3. You can do:

    select *, 1.0 * repurchases / purchases as repurchase_rate
    from (
      select listing_id, sum(cnt) as purchases, sum(cnt - 1) as repurchases
      from (
        select listing_id, buyer_id, count(*) as cnt
        from items_purchased
        group by listing_id, buyer_id
      ) x
      group by listing_id
    ) y
    

    Result:

     listing_id  purchases  repurchases  repurchase_rate        
     ----------- ---------- ------------ ---------------------- 
     5678        3          1            0.33333
     3456        2          0            0.00000
    

    See running example at db<>fiddle.

    Login or Signup to reply.
  4. Use count() and count() - count(distinct) to count purchases and repurchases, respectively.

    SELECT listing_id, COUNT(buyer_id) purchases, 
    COUNT(buyer_id) - COUNT(DISTINCT buyer_id) repurchases,
    CONCAT(ROUND(100 * (1 - COUNT(DISTINCT buyer_id)/COUNT(buyer_id))), '%')  repurchase_rate
    FROM items_purchased 
    GROUP BY listing_id;
    
    listing_id purchases repurchases repurchase_rate
    3456 2 0 0%
    5678 3 1 33%

    db fiddle

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search