skip to Main Content

I have a query below:

select date_trunc('day', ordertimestamp) as days 
, product
, revenue
from table

The result is :

days       |   product    | revenue

2022-07-10 |   shoes      |  $ 430
2022-07-10 |   chocolate  |  $ 10
2022-07-10 |   shirts     |  $ 3.25
2022-07-10 |   phones     |  $ 500
2022-01-02 |   vests      |  $ 200
2022-02-03 |   computers  |  $ 700

I need to put this table in a manner whereby the top 3 product categories by revenue for 2022-07-10 appear on ONE row like this:

2022-07-10 |  phones, shoes, chocolate

as you can see they appear in descending order of revenue

I tried something like this:

SELECT
array(select products from dataset order by revenue desc limit 3) as top_three_product_categories_by_revenue
FROM table 

But I’m not sure if it is accurate

3

Answers


  1. row_number() function to assign number to each row by a group and order .

    rn<3 to get only the 3 top products per date.

    with cte as (
      select days, product, row_number() over (partition by days order by revenue ASC) as rn
      from mytable
    )
    select days, string_agg(product, ', ' ORDER BY rn DESC) as top_products
    from cte
    where rn <= 3
    group by days
    

    Demo here

    Login or Signup to reply.
  2. This code will work on DuckDB but can be easily ported to other databases. The first portion just generates fake data.

    with cte_sequence as (
        select * from generate_series(1,100) as id
    ),
    cte_base_data as (
        select
            id,
            timezone('utc', now()) - floor(random()*id*10000)::int) as order_datetime,
            'product' || floor(3*random())::int::text as product,
            random()*100 as revenue
        from
            cte_sequence
    ),
    cte_top_products as (
        select
            date_trunc('day', order_datetime) as order_date,
            product,
            revenue,
            row_number() over(partition by order_date order by revenue desc) as row_id
        from
            cte_base_data
    )
    select
        order_date,
        string_agg(distinct product, ', ') as top_products_by_revenue
    from
        cte_top_products
    where
        row_id <= 3
    group by 1
    order by 1
    
    Login or Signup to reply.
  3. As the others already posted in the meantime:

    ROW_NUMBER() partitioned by the date column over ORDER BY revenue DESC, then filter by the resulting row number less than 4, group by the date and STRING_AGG() of product .

    marco ~/1/Vertica/supp $ psql -af top3list.sql
    WITH
    indata(dt,product,revenue) AS (
              SELECT DATE '2022-07-10','shoes'    ,430
    UNION ALL SELECT DATE '2022-07-10','chocolate',10
    UNION ALL SELECT DATE '2022-07-10','shirts'   ,3.25
    UNION ALL SELECT DATE '2022-07-10','phones'   ,500
    UNION ALL SELECT DATE '2022-01-02','vests'    ,200
    UNION ALL SELECT DATE '2022-02-03','computers',700
    )
    ,
    w_rn AS (
      SELECT
        *
      , ROW_NUMBER() OVER w AS rn
      FROM indata
      WINDOW w AS (PARTITION BY dt ORDER BY revenue DESC)
    )
    SELECT
      dt
    , STRING_AGG(product,',' ORDER BY revenue DESC) AS top3
    FROM w_rn
    WHERE rn < 4
    GROUP BY dt
    ;
         dt     |          top3          
    ------------+------------------------
     2022-01-02 | vests
     2022-02-03 | computers
     2022-07-10 | phones,shoes,chocolate
    
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search