skip to Main Content

I want to query the jsonb column in the below table, sum all fields named time, and then query for any object ‘"type": "Unknown" and also have a time that is greater than 10% of the previously calculated sum.

CREATE TABLE process (id int, uid uuid, process_stat_json jsonb);

INSERT INTO process VALUES
  (1, '9cf237e8-0c73-4c4d-a60c-30b244789e67', '{"Type A": {"time": 500, "complete": 100.0, "type": "Unknown"}, "Type B": {"time": 35, "complete": 100.0, "type": "Parallel"}, "Type C": {"time": 50, "complete": 100.0, "type": "Serial"}}')
, (2, '07ac957f-c9e4-460e-b75f-79c0d9c550d4', '{"Type A": {"time": 55, "complete": 100.0, "type": "Parallel"}, "Type C": {"time": 20, "complete": 100.0, "type": "Serial"}}')
, (3, '8e8ef827-fff1-4ea1-a990-206b557ef27a', '{"Type C": {"time": 100, "complete": 100.0, "type": "Parallel"}, "Type A": {"time": 55, "complete": 100.0, "type": "Unknown"}, "Type D": {"time": 70, "complete": 100.0, "type": "Serial"}}')
, (4, '5279e6ae-f2e8-4fae-8d0b-9297b5e2936a', '{"Type A": {"time": 160, "complete": 100.0, "type": "Serial"}, "Type D": {"time": 60, "complete": 100.0, "type": "Unknown"}, "Type C": {"time": 80, "complete": 100.0, "type": "Serial"}}')
;

So, for the above data item 4 : Sum = 160+60+80=300, "Unknown" time is 60, which would result in a 20% result when compared to the sum, so that would be returned as a result.

I’ve tried the following query, but its not getting the into the object to get the time field to sum it correctly:

SELECT p.uid, jsonb_build_object('time', sum((process_stat_json->'time')::numeric))
FROM process p
GROUP BY uid;

fiddle

I’ve previously asked about this dataset – and am now expanding this query to try and get more out of it

2

Answers


  1. You can extract data using dynamic json path and aggregate it per requirement

    with types as(
           select distinct jsonb_object_keys(process_stat_json) tn
           from process
    )
    select p.id, p.uid, coalesce(sum(case when ttype='"Unknown"' then ttime end),0)/ sum(ttime)  p
    from  process p
    cross join lateral(
      select ttime::decimal ttime, ttype::varchar(20)
      from (
         select p.process_stat_json#>array[types.tn,'time'] ttime, p.process_stat_json#>array[types.tn,'type'] ttype
         from types
      ) t 
      where ttype is not null
    ) t
    group by p.id, p.uid
    

    db<>fiddle

    Login or Signup to reply.
  2. The "old school" way – first flatten process_stat_json field and calculate the sum of time attributes with a window function (t CTE) and then do a routine selection.

    with t as
    (
     select id, uid, j.val,
           sum((j.val ->> 'time')::numeric) over (partition by uid) as s 
     from process
     cross join lateral jsonb_each (process_stat_json) as j(k, val)
    )
    select id, uid, val from t 
    where val ->> 'type' = 'Unknown' and (val ->> 'time')::numeric > 0.1 * s ;
    
    id uid val
    4 5279e6ae-f2e8-4fae-8d0b-9297b5e2936a {"time": 60, "type": "Unknown", "complete": 100.0}
    3 8e8ef827-fff1-4ea1-a990-206b557ef27a {"time": 55, "type": "Unknown", "complete": 100.0}
    1 9cf237e8-0c73-4c4d-a60c-30b244789e67 {"time": 500, "type": "Unknown", "complete": 100.0}

    DB-fiddle

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search