skip to Main Content

I am facing issue while filtering data with array
I have columns userid,event_name,attributes,ti

Attributes column have value like this

{"bool_sample":true,"array_int":[10,20,25,38],"array_string":["hello","world","i am fine"]}

My query

SELECT * FROM "events-data" CROSS JOIN UNNEST(CAST(json_extract(attributes, '$["array_int"]') AS array<int>)) AS t(val)
WHERE val > 9; 

This query filtering data but it giving me multiple row for same record, for above attributes column record it giving me 4 rows

userid,event_name,attributes,ti,val
test-userid,test-event,test-attributes,10
test-userid,test-event,test-attributes,20
test-userid,test-event,test-attributes,25
test-userid,test-event,test-attributes,38

I do not need multiple rows

2

Answers


  1. No need CROSS JOIN with UNNEST extracted data. Check exists for.

    Try

    SELECT * FROM "events-data" 
    WHERE exists (
        select val 
        from UNNEST(CAST(json_extract_path(attributes, '$.["array_int"]') AS int[])) AS t(val) 
        where val>9)
    ; 
    
    Login or Signup to reply.
  2. That is what UNNEST does – it expands array in multiple rows (one row per array element, see the docs).

    If the goal is to fetch rows with array_int containing values more than 9 then you can use JSON functions, for example json_exists:

    -- sample data
    WITH dataset(userid,event_name,attributes) as (
        values ('test-userid', 'test-event', '{"bool_sample":true,"array_int":[10,20,25,38],"array_string":["hello","world","i am fine"]}')
    )
    
    -- query
    select *
    from dataset
    where json_exists(attributes, 'lax $.array_int[*]?(@ > 9)');
    

    Output:

    userid event_name attributes
    test-userid test-event {"bool_sample":true,"array_int":[10,20,25,38],"array_string":["hello","world","i am fine"]}

    If you need to get the array itself with filtered data then you can use the casting+filtering:

    -- query
    select userid, event_name, vals
    from (
        select *, filter(cast(json_extract(attributes, '$.array_int') as array(int)), v -> v > 9) vals
        from dataset)
    where cardinality(vals) > 0;
    

    Output:

    userid event_name vals
    test-userid test-event {10,20,25,38}
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search