
In SQL, I have a table that looks like this:

[image: screenshot of the table]

I want to combine the values from ‘Value_New’ per ‘id_Parent_Record’, ordered by ‘Seq_Number’, which means the sequence should always start from 1.

As an example, 637454609993200 has a total of 4 movements for ‘Field_Changed’ = ‘Task_Workflow’.

Expected output should be exactly like this:

Initial Referral -> IRU -> Adjustment Approved -> Settlement

This is the code where I am currently at:

select ac."id_Parent_Record"
,ac."Field_Changed"
,ac."Value_Previous"
,ac."Value_New"
,ac."Changed_Date" 
,row_number() over (partition by "id_Parent_Record", "Field_Changed"  order by "Changed_Date" asc) as "Seq_Number"
from public.api_changelog ac
where ac."Field_Changed" IN ('_id_Matter','Task_Workflow')
and ac."id_Parent_Record" = '637454609993200';

The expected output should have a new field named ‘Transitions’. For the account 637454609993200, every row where Field_Changed equals Task_Workflow should carry this value:

Initial Referral -> IRU -> Adjustment Approved -> Settlement

2 Answers


  1. The STRING_AGG function should be able to accomplish this.

    Each Value_New should be delimited by ' -> ' and it should be ordered by the Changed_Date field. A GROUP BY will need to be included.

    select ac."id_Parent_Record"
    ,ac."Field_Changed"
    , STRING_AGG(ac."Value_New", ' -> ' ORDER BY ac."Changed_Date" asc) AS "Transitions"
    from public.api_changelog ac
    where ac."Field_Changed" IN ('_id_Matter','Task_Workflow')
    and ac."id_Parent_Record" = '637454609993200'
    GROUP BY ac."id_Parent_Record"
    ,ac."Field_Changed"
    ;
    

    Output:

    id_Parent_Record   Field_Changed   Transitions
    637454609993200    _id_Matter      637417579791350
    637454609993200    Task_Workflow   Initial Referral -> IRU -> Adjustment Approved -> Settlement

    A CTE could be used to join the aggregated result back to the original rows, so that every row carries its "Transitions" value:

    WITH transitions AS (
        select ac."id_Parent_Record"
        ,ac."Field_Changed"
        , STRING_AGG(ac."Value_New", ' -> ' ORDER BY ac."Changed_Date" asc) AS "Transitions"
        from public.api_changelog ac
        where ac."Field_Changed" IN ('_id_Matter','Task_Workflow')
        and ac."id_Parent_Record" = '637454609993200'
        GROUP BY ac."id_Parent_Record"
        ,ac."Field_Changed"
    )
    
    select ac."id_Parent_Record"
    ,ac."Field_Changed"
    ,ac."Value_Previous"
    ,ac."Value_New"
    ,ac."Changed_Date" 
    , t."Transitions"
    from public.api_changelog ac
    INNER JOIN transitions t
        ON ac."id_Parent_Record" = t."id_Parent_Record"
        AND ac."Field_Changed" = t."Field_Changed"
    ;
    
    id_Parent_Record   Field_Changed   Value_Previous        Value_New             Changed_Date   Transitions
    637454609993200    _id_Matter                            637417579791350       2021-01-07     637417579791350
    637454609993200    Task_Workflow                         Initial Referral      2021-01-05     Initial Referral -> IRU -> Adjustment Approved -> Settlement
    637454609993200    Task_Workflow   Initial Referral      IRU                   2021-01-06     Initial Referral -> IRU -> Adjustment Approved -> Settlement
    637454609993200    Task_Workflow   IRU                   Adjustment Approved   2021-03-26     Initial Referral -> IRU -> Adjustment Approved -> Settlement
    637454609993200    Task_Workflow   Adjustment Approved   Settlement            2021-06-02     Initial Referral -> IRU -> Adjustment Approved -> Settlement
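
    As an alternative to the CTE and join, the same per-row result can probably be produced in one pass by calling STRING_AGG as a window function. The following is only a sketch, assuming PostgreSQL. The `sample` CTE is a hypothetical stand-in for public.api_changelog, filled with the rows shown above, and it assumes "id_Parent_Record" is stored as text. PostgreSQL does not accept ORDER BY inside the aggregate call when OVER is present, so the ordering moves into the window definition, and the explicit frame lets every row see its whole partition.

    ```sql
    -- "sample" is a hypothetical stand-in for public.api_changelog (assumed column types).
    WITH sample ("id_Parent_Record", "Field_Changed", "Value_Previous", "Value_New", "Changed_Date") AS (
        VALUES
          ('637454609993200', '_id_Matter',    NULL::text,            '637417579791350',     DATE '2021-01-07'),
          ('637454609993200', 'Task_Workflow', NULL,                  'Initial Referral',    DATE '2021-01-05'),
          ('637454609993200', 'Task_Workflow', 'Initial Referral',    'IRU',                 DATE '2021-01-06'),
          ('637454609993200', 'Task_Workflow', 'IRU',                 'Adjustment Approved', DATE '2021-03-26'),
          ('637454609993200', 'Task_Workflow', 'Adjustment Approved', 'Settlement',          DATE '2021-06-02')
    )
    select s."id_Parent_Record"
    ,s."Field_Changed"
    ,s."Value_Previous"
    ,s."Value_New"
    ,s."Changed_Date"
    -- Ordering lives in the window; the full frame gives each row the complete chain.
    , STRING_AGG(s."Value_New", ' -> ') OVER (
          PARTITION BY s."id_Parent_Record", s."Field_Changed"
          ORDER BY s."Changed_Date" asc
          ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
      ) AS "Transitions"
    from sample s
    where s."Field_Changed" IN ('_id_Matter','Task_Workflow')
    order by s."Field_Changed", s."Changed_Date"
    ;
    ```

    If two changes for the same record and field share the same Changed_Date, their relative order in the string is not guaranteed; adding a tie-breaking column to the window's ORDER BY would make it deterministic.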
  2. For a typical data warehouse dimension table, the data would look like this. I’m not posting this as an answer for you to accept; I’m only showing how such a table simplifies your query. No query needed.

    surr_key (surrogate key) = Seq_Number

    nat_key (natural key) = id_Parent_Record

    surr_key   nat_key   Task_Workflow         _id_Matter    Transition
    1          200       Initial Referral      350           Initial Referral
    2          200       IRU                   350           Initial Referral > IRU
    3          200       Adjustment Approved   350           Initial Referral > IRU > Adjustment Approved
    4          200       Settlement            350           Initial Referral > IRU > Adjustment Approved > Settlement
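
    If a table like this had to be populated from the changelog, the cumulative Transition column could plausibly be derived with a running window aggregate. This is a sketch only, assuming PostgreSQL; the `sample` CTE is a hypothetical stand-in for the Task_Workflow rows of public.api_changelog, and the ' > ' delimiter follows the table above.

    ```sql
    -- "sample" is a hypothetical stand-in for the Task_Workflow changelog rows.
    WITH sample ("id_Parent_Record", "Value_New", "Changed_Date") AS (
        VALUES
          ('637454609993200', 'Initial Referral',    DATE '2021-01-05'),
          ('637454609993200', 'IRU',                 DATE '2021-01-06'),
          ('637454609993200', 'Adjustment Approved', DATE '2021-03-26'),
          ('637454609993200', 'Settlement',          DATE '2021-06-02')
    )
    select row_number() OVER w AS surr_key
    ,s."id_Parent_Record" AS nat_key
    ,s."Value_New" AS "Task_Workflow"
    -- The frame stops at the current row, so each row holds the chain up to that point.
    , STRING_AGG(s."Value_New", ' > ') OVER w AS "Transition"
    from sample s
    WINDOW w AS (
        PARTITION BY s."id_Parent_Record"
        ORDER BY s."Changed_Date" asc
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    )
    order by surr_key
    ;
    ```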
    

    One drawback of this data model compared to yours is that whenever a column is changed, added, or deleted in the operational (upstream) database, the dimension table above would need to be rebuilt and its data moved over with the new DDL.
