skip to Main Content

I need to get list of transactions with list of logs and tags inside each transaction and the most important make search by field in logs and tags and unite it all.
Example below is search by field1 and tag1. How to do that? Also important: I have 500m+ rows. If using WHERE id IN .. will it be slow?

transactions
 id int
 hash var

id | hash 
------------
1  |  h1
2  |  h2
3  |  h3
logs
 transaction_id int
 field1 var
 value var

transaction_id | field1 | value 
-------------------------------
1              | f1     | v1
1              | f2     | v2
2              | f3     | v3
3              | f4     | v4
tags
 transaction_id int
 tag1 var
 value var

transaction_id | tag1   | value 
-------------------------------
1              | t1     | v1
2              | t2     | v2
2              | t3     | v3

I need result:

id | hash | logs_array                                                         | tags_array
------------------------------------------------------------------------------------------
1  | h1   | [{'field1': 'f1', 'value': 'v1'}, {'field1': 'f2', 'value': 'v2'}] | [{'tag1': 't1', 'value': 'v1'}]
2  | h2   | [{'field1': 'f3', 'value': 'v3'}]                                  | [{'tag1': 't2', 'value': 'v2'}, {'tag1': 't3', 'value': 'v3'}]
3  | h3   | [{'field1': 'f4', 'value': 'v4'}]                                  | []

Example template I can think of:

SELECT t.id, t.hash, array_agg(..) FROM transactions t WHERE id in
    (
        SELECT transaction_id FROM logs WHERE field1={any_string}
    )
    OR id in
    (
        SELECT transaction_id FROM tags WHERE tag1={any_string}
    )
LEFT JOIN logs lo ON t.id = lo.transaction_id
LEFT JOIN tags ta ON t.id = ta.transaction_id

2

Answers


  1. That could be accomplished with a query like this

    select t.id, t.hash, 
    array_agg (distinct jsonb_strip_nulls(jsonb_build_object('field1', l.field1, 'value', l."value"))) logs_array,
    array_agg (distinct jsonb_strip_nulls(jsonb_build_object('tag1', tg.tag1, 'value', tg."value"))) tags_array
    from transactions t 
    left join logs l on t.id = l.transaction_id 
    left join tags tg on t.id = tg.transaction_id
    group by t.id, t.hash 
    order by t.id;
    

    Another version using conditions and stripping nulls

    select t.id, t.hash, 
    array_agg (distinct jsonb_strip_nulls(jsonb_build_object('field1', l.field1, 'value', l."value"))) logs_array,
    array_agg (distinct jsonb_strip_nulls(jsonb_build_object('tag1', tg.tag1, 'value', tg."value"))) tags_array
    from transactions t 
    left join logs l on t.id = l.transaction_id 
    left join tags tg on t.id = tg.transaction_id
    where l.field1 = 'f1' and tg.tag1 = 't1'
    group by t.id, t.hash
    order by t.id;
    

    Fiddle to test

    Login or Signup to reply.
  2. I think it’s preferable to do a pre-aggregation.

    See example

    SELECT t.id, t.hash, logs_array, tags_array
    FROM 
    (select transaction_id,array_agg(el1) tags_array 
      from
      (select transaction_id,JSON_BUILD_OBJECT(tag1,value) el1 FROM tags
        WHERE transaction_id in (
              SELECT transaction_id FROM tags WHERE tag1='t5' --{any_string}
          )
      )st
      group by transaction_id
    )ta 
    full join
    (select transaction_id,array_agg(el1) logs_array 
      from
      (select transaction_id,JSON_BUILD_OBJECT(field1,value) el1 FROM logs
        WHERE transaction_id in (
              SELECT transaction_id FROM logs WHERE field1='f4' --{any_string}
          )
      )sl
      group by transaction_id
    )la on la.transaction_id=ta.transaction_id
    left join transactions t on t.id=coalesce(la.transaction_id,ta.transaction_id)
    

    Demo

    Joins performance – there is no doubt about it. Surely there are suitable indexes.
    The main cost of the operation will be in the search for WHERE field1={any string}, "full table scan" if no suitable indexes.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search