skip to Main Content

I have a table in Postgres with an hstore column like:

"surface => concrete, ramp => yes, incline => up, highway => footway"
"surface => asphalt, lit => no, source => survey, incline => 12.6%, highway => footway"
"bicycle => yes, surface => asphalt, highway => footway, segregated => no"

Now I would like to analyse the keys among themselves. Therefore I want to know for the whole dataset how often the keys occur in combination pairwise.

The result should look something like this:

|      Key 1          |     Key 2        |       Count      |
|---------------------|------------------|------------------|
|      surface        |    source        |         1        |
|      surface        |    highway       |         3        |
|      surface        |    incline       |         2        |
|      highway        |    bicycle       |         1        |
.....

2

Answers


  1. Use hstore’s each() to expand it out, then join and aggregate:

    with expand as (
      select id, k, v
        from htab 
             cross join lateral each(hcol) as c(k, v)
    ), pairings as (
      select t1.id, t1.k as key1, t2.k as key2
        from expand t1
             join expand t2
                on t2.id = t1.id
               and t2.k < t1.k
    )
    select key1, key2, count(*)
      from pairings
     group by key1, key2
     order by key1 desc, key2 desc;
    

    fiddle

    Login or Signup to reply.
  2. Use the function akeys() to get all keys of the hstore column as an array, generate all distinct pairs of the arrays elements with generate_subscripts() and group the result by the pairs:

    select akeys[i] as key1, akeys[j] as key2, count(*)
    from my_table
    cross join akeys(hstore_col)
    cross join generate_subscripts(akeys, 1) as i
    cross join generate_subscripts(akeys, 1) as j
    where akeys[i] > akeys[j]
    group by 1, 2
    order by 1 desc, 2 desc
    

    Test it in db<>fidlle.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search