skip to Main Content

How to calculate median for selling_eur in this table?

id listing_id category amount original_eur selling_eur user_id rounds_id
0 10088 m1-1579cd77-6863-469d-8798-9e1bef199d2f Category 2 2 nan 350.11 eu-central-1:6fcfad76-ef06-4ea6-9e14-9fa4a5d62b8f 31
1 10089 m1-2adfde90-de27-4048-aa3e-bc7e76c03b0c Prime Seats 4 nan 962.81 eu-central-1:6fcfad76-ef06-4ea6-9e14-9fa4a5d62b8f 28
2 10090 m3-024c20ad-e21d-4bb3-966e-44c69e760b15 Category 1 4 200 750 eu-central-1:33d3dbe4-6414-4949-ae21-f2bc8819ae4d 29
3 10091 m1-c612ad6c-d000-4bb5-8dc4-55e3b49d54cd Category 2 4 150 450 eu-central-1:fe695442-13a6-42e6-a015-046415bf369d 28
4 10092 m1-cd449fb2-6134-428b-946a-fc8a7bec30e7 Category 1 4 nan 647 eu-central-1:b926a781-4fc0-4f71-bd0f-69b50b0be68d 28

Calculate average is quite simle: sum( selling_eur * amount ) / sum ( amount).
But that about median?

Is it simple make arrays with prices, example for 1 row:
{350.1, 350.1}.
And make calculation for this values?

2

Answers


  1. You could repeat your selling_eur exactly amount times in a subquery and feed that to percentile_cont(). Demo at db<>fiddle:

    select percentile_cont(.5)within group(order by selling_eur2)
    from your_table cross join lateral
    (select generate_series(1,amount),selling_eur as selling_eur2);
    
    percentile_cont
    647

    You can check what percentile_cont() saw and verify that’s the median:

    select row_number()over(order by selling_eur2)
          ,selling_eur
          ,id
          ,left(listing_id,6) as listing_id
          ,category
          ,amount
          ,n
          ,original_eur
    from your_table cross join lateral
    (select generate_series(1,amount)n,selling_eur as selling_eur2)
    order by selling_eur2;
    
    row_number selling_eur id listing_id category amount n original_eur
    1 350.11 10088 m1-157 Category 2 2 2 NaN
    2 350.11 10088 m1-157 Category 2 2 1 NaN
    3 450 10091 m1-c61 Category 2 4 1 150
    4 450 10091 m1-c61 Category 2 4 4 150
    5 450 10091 m1-c61 Category 2 4 3 150
    6 450 10091 m1-c61 Category 2 4 2 150
    7 647 10092 m1-cd4 Category 1 4 4 NaN
    8 647 10092 m1-cd4 Category 1 4 1 NaN
    middle value 9 647 10092 m1-cd4 Category 1 4 2 NaN
    also middle value 10 647 10092 m1-cd4 Category 1 4 3 NaN
    11 750 10090 m3-024 Category 1 4 1 200
    12 750 10090 m3-024 Category 1 4 2 200
    13 750 10090 m3-024 Category 1 4 3 200
    14 750 10090 m3-024 Category 1 4 4 200
    15 962.81 10089 m1-2ad Prime Seats 4 1 NaN
    16 962.81 10089 m1-2ad Prime Seats 4 4 NaN
    17 962.81 10089 m1-2ad Prime Seats 4 3 NaN
    18 962.81 10089 m1-2ad Prime Seats 4 2 NaN
    Login or Signup to reply.
  2. Here’s the one with window functions. The subquery adds to each row a range of positions according to the selling_eur order, and from that you can grab the value right below and right above the middle position (floor() and ceil() filters), then divide their sum by two: demo

    select sum( selling_eur * amount ) / sum ( amount) as "average"
          ,( max(selling_eur)filter(where floor(middle_index)::int8 <@ int8range)
            +max(selling_eur)filter(where  ceil(middle_index)::int8 <@ int8range)
           )/2. as "median"
    from(select *,int8range(sum(amount)over w1 -amount+1,
                            sum(amount)over w1,'[]')
                 ,(1.+sum(amount)over())/2. as middle_index
         from your_table
         window w1 as (order by selling_eur) ) as a;
    
    average median
    663.3033333333333333 647.0000000000000000
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search