skip to Main Content

I have table like

WITH  vals (k, v, z, y, m) AS (VALUES (0, 0, 1, 8, 0), (1, 2, 1, 1, 6), (1, 0, 0, 2, 2),
(5, 4, 6, 8, 9), (0, 0, 0, 6, 6))
select vals.* from vals
k v z y m
0 0 1 8 0
1 2 1 1 6
1 0 0 2 2
5 4 6 8 9
0 0 0 6 6

I want to calculate modal value for each row. If there are multiple modal values, return such value, that occurs first.

k v z y m modal
0 0 1 8 0 0
1 2 1 1 6 1
1 0 0 2 2 2
5 4 6 8 9 5
0 0 0 6 6 0

How can I do it in PostreSQL?

2

Answers


  1. Assuming only three value columns, we can handle your requirement using CASE expressions:

    SELECT
        k,
        v,
        z,
        CASE WHEN k = v OR z = k THEN k
             WHEN v = z THEN v
             ELSE k END AS modal
    FROM vals;
    

    Demo

    The logic here is that if we can find two columns in agreement, then by definition their common value must be the mode of the three columns. If we cannot find any pair with a common value, then we default to returning the first value, which is the k column.

    Login or Signup to reply.
  2. PostgreSQL has a built-in mode()within group(order by e) function:

    mode () WITHIN GROUP ( ORDER BY anyelement ) → anyelement
    Computes the mode, the most frequent value of the aggregated argument (arbitrarily choosing the first one if there are multiple equally-frequent values). The aggregated argument must be of a sortable type.

    Which matches the Wikipedia definition of a modal value:

    Type Description Example Result
    Arithmetic mean Sum of values of a data set divided by number of values (1+2+2+3+4+7+9) / 7 4
    Median Middle value separating the greater and lesser halves of a data set 1, 2, 2, 3, 4, 7, 9 3
    Mode Most frequent value in a data set 1, 2, 2, 3, 4, 7, 9 2

    The function is an ordered-set aggregate, not a variadic, so you’d have to pretend values in each row are a column. You can pack them into an array[], then unnest() it:
    demo at db-fiddle

    WITH arrs as(select *,array[k,v,z] as arr 
                 from vals)
    select k,v,z
          ,(select mode()within group(order by e) 
            from unnest(arr) as e) as modal
    from arrs;
    

    Or try unpivoting, then re-aggregating:

    WITH one_column as(select ctid,k as c from vals
                       union all
                       select ctid,v from vals
                       union all
                       select ctid,z from vals)
    select min(vals.k)
          ,min(vals.v)
          ,min(vals.z)
          ,mode()within group(order by c)
    from one_column
    join vals on one_column.ctid=vals.ctid
    group by ctid;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search