Is there a way to combine values from a single field that will always follow the sequence to create all the movements per ID? - PostgreSQL

plae
April 7, 2023
2 Answers

In SQL, I have a table that looks like this:

I want to combine the values of ‘Value_New’ per ‘id_Parent_Record’ in ‘Seq_Number’ order, which means the sequence should always start from 1.

As an example, 637454609993200 has a total of 4 movements for ‘Field_Changed’ = ‘Task_Workflow’.

Expected output should be exactly like this:

Initial Referral -> IRU -> Adjustment Approved -> Settlement

This is the query I have so far:

select ac."id_Parent_Record"
,ac."Field_Changed"
,ac."Value_Previous"
,ac."Value_New"
,ac."Changed_Date" 
,row_number() over (partition by "id_Parent_Record", "Field_Changed"  order by "Changed_Date" asc) as "Seq_Number"
from public.api_changelog ac
where ac."Field_Changed" IN ('_id_Matter','Task_Workflow')
and ac."id_Parent_Record" = '637454609993200';

The expected output should have a new field named ‘Transitions’ whose value, on every row for account 637454609993200 where ‘Field_Changed’ equals ‘Task_Workflow’, looks like this:

Initial Referral -> IRU -> Adjustment Approved -> Settlement

Tags: postgresql sql

Answers

The STRING_AGG function should be able to accomplish this.

Each Value_New should be delimited by ' -> ' and ordered by the Changed_Date field. A GROUP BY on the non-aggregated columns (id_Parent_Record and Field_Changed) will need to be included.

select ac."id_Parent_Record"
,ac."Field_Changed"
, STRING_AGG(ac."Value_New", ' -> ' ORDER BY ac."Changed_Date" asc) AS "Transitions"
from public.api_changelog ac
where ac."Field_Changed" IN ('_id_Matter','Task_Workflow')
and ac."id_Parent_Record" = '637454609993200'
GROUP BY ac."id_Parent_Record"
,ac."Field_Changed"
;

Output:

id_Parent_Record	Field_Changed	Transitions
637454609993200	_id_Matter	637417579791350
637454609993200	Task_Workflow	Initial Referral -> IRU -> Adjustment Approved -> Settlement
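
To try the aggregation without access to the real table, here is a minimal sketch that inlines the sample values from this thread as a CTE. The name sample_changelog is a hypothetical stand-in for public.api_changelog, and the id is assumed to be stored as text because the question compares it to a string literal.

```sql
-- sample_changelog is a hypothetical stand-in for public.api_changelog,
-- filled with the sample values that appear in this thread.
WITH sample_changelog ("id_Parent_Record", "Field_Changed", "Value_New", "Changed_Date") AS (
    VALUES
        ('637454609993200', '_id_Matter',    '637417579791350',     DATE '2021-01-07'),
        ('637454609993200', 'Task_Workflow', 'Initial Referral',    DATE '2021-01-05'),
        ('637454609993200', 'Task_Workflow', 'IRU',                 DATE '2021-01-06'),
        ('637454609993200', 'Task_Workflow', 'Adjustment Approved', DATE '2021-03-26'),
        ('637454609993200', 'Task_Workflow', 'Settlement',          DATE '2021-06-02')
)
SELECT sc."id_Parent_Record"
     , sc."Field_Changed"
       -- The ORDER BY inside the aggregate call fixes the concatenation order.
     , string_agg(sc."Value_New", ' -> ' ORDER BY sc."Changed_Date") AS "Transitions"
FROM sample_changelog sc
GROUP BY sc."id_Parent_Record"
       , sc."Field_Changed";
```

If two changes can share the same Changed_Date, this ORDER BY does not decide which comes first. Adding a tie-breaker column to the ORDER BY, if the table has one, would make the result deterministic.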

A CTE can be used to join the aggregated result back to the original rows:

WITH transitions AS (
    select ac."id_Parent_Record"
    ,ac."Field_Changed"
    , STRING_AGG(ac."Value_New", ' -> ' ORDER BY ac."Changed_Date" asc) AS "Transitions"
    from public.api_changelog ac
    where ac."Field_Changed" IN ('_id_Matter','Task_Workflow')
    and ac."id_Parent_Record" = '637454609993200'
    GROUP BY ac."id_Parent_Record"
    ,ac."Field_Changed"
)

select ac."id_Parent_Record"
,ac."Field_Changed"
,ac."Value_Previous"
,ac."Value_New"
,ac."Changed_Date" 
, t."Transitions"
from public.api_changelog ac
INNER JOIN transitions t
    ON ac."id_Parent_Record" = t."id_Parent_Record"
    AND ac."Field_Changed" = t."Field_Changed"
;

id_Parent_Record	Field_Changed	Value_Previous	Value_New	Changed_Date	Transitions
637454609993200	_id_Matter		637417579791350	2021-01-07	637417579791350
637454609993200	Task_Workflow		Initial Referral	2021-01-05	Initial Referral -> IRU -> Adjustment Approved -> Settlement
637454609993200	Task_Workflow	Initial Referral	IRU	2021-01-06	Initial Referral -> IRU -> Adjustment Approved -> Settlement
637454609993200	Task_Workflow	IRU	Adjustment Approved	2021-03-26	Initial Referral -> IRU -> Adjustment Approved -> Settlement
637454609993200	Task_Workflow	Adjustment Approved	Settlement	2021-06-02	Initial Referral -> IRU -> Adjustment Approved -> Settlement
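
As an alternative to the CTE and join, the same per-row result can probably be obtained with string_agg used as a window function. This is a sketch, not part of the answer above: sample_changelog is again a hypothetical stand-in for public.api_changelog, filled with the rows shown in the output. PostgreSQL does not accept an ORDER BY inside the aggregate call when OVER is used, so the ordering moves into the window definition, and the frame is widened to the whole partition so that every row sees all values.

```sql
-- sample_changelog is a hypothetical stand-in for public.api_changelog.
WITH sample_changelog ("id_Parent_Record", "Field_Changed", "Value_Previous", "Value_New", "Changed_Date") AS (
    VALUES
        ('637454609993200', '_id_Matter',    NULL,                  '637417579791350',     DATE '2021-01-07'),
        ('637454609993200', 'Task_Workflow', NULL,                  'Initial Referral',    DATE '2021-01-05'),
        ('637454609993200', 'Task_Workflow', 'Initial Referral',    'IRU',                 DATE '2021-01-06'),
        ('637454609993200', 'Task_Workflow', 'IRU',                 'Adjustment Approved', DATE '2021-03-26'),
        ('637454609993200', 'Task_Workflow', 'Adjustment Approved', 'Settlement',          DATE '2021-06-02')
)
SELECT sc."id_Parent_Record"
     , sc."Field_Changed"
     , sc."Value_Previous"
     , sc."Value_New"
     , sc."Changed_Date"
     , row_number() OVER w AS "Seq_Number"
       -- Frame covers the whole partition, so each row gets the full chain.
     , string_agg(sc."Value_New", ' -> ')
           OVER (w ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS "Transitions"
FROM sample_changelog sc
WHERE sc."Field_Changed" IN ('_id_Matter', 'Task_Workflow')
  AND sc."id_Parent_Record" = '637454609993200'
WINDOW w AS (PARTITION BY sc."id_Parent_Record", sc."Field_Changed"
             ORDER BY sc."Changed_Date");
```

This keeps Seq_Number from the original query and reads the table once. It relies on the aggregate receiving the frame's rows in window order, which is how PostgreSQL behaves in practice. The GROUP BY version states the ordering more explicitly.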

- Entree
- April 7, 2023 at 6:11 am
For a typical data warehouse dimension table, the data would look like this. I’m not posting this as an answer for you to accept, only to show how such a table simplifies your query: no aggregation query is needed at read time.

surr_key (surrogate key) = Seq_Number

nat_key (natural key) = id_Parent_Record
```
surr_key   nat_key   Task_Workflow         _id_Matter    Transition
1          200       Initial Referral      350           Initial Referral
2          200       IRU                   350           Initial Referral > IRU
3          200       Adjustment Approved   350           Initial Referral > IRU > Adjustment Approved
4          200       Settlement            350           Initial Referral > IRU > Adjustment Approved > Settlement
```
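
If such a table had to be populated from the changelog, the cumulative Transition column could be derived with a running window aggregate. The following is only a sketch under assumptions: sample_changelog is a hypothetical stand-in for public.api_changelog, only the Task_Workflow rows from this thread are used, and nat_key shows the full id rather than the shortened 200 used above.

```sql
-- sample_changelog is a hypothetical stand-in for public.api_changelog.
WITH sample_changelog ("id_Parent_Record", "Field_Changed", "Value_New", "Changed_Date") AS (
    VALUES
        ('637454609993200', 'Task_Workflow', 'Initial Referral',    DATE '2021-01-05'),
        ('637454609993200', 'Task_Workflow', 'IRU',                 DATE '2021-01-06'),
        ('637454609993200', 'Task_Workflow', 'Adjustment Approved', DATE '2021-03-26'),
        ('637454609993200', 'Task_Workflow', 'Settlement',          DATE '2021-06-02')
)
SELECT row_number() OVER w   AS surr_key
     , sc."id_Parent_Record" AS nat_key
     , sc."Value_New"        AS "Task_Workflow"
       -- Frame ends at the current row, so the chain grows by one step per row.
     , string_agg(sc."Value_New", ' > ')
           OVER (w ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS "Transition"
FROM sample_changelog sc
WHERE sc."Field_Changed" = 'Task_Workflow'
WINDOW w AS (PARTITION BY sc."id_Parent_Record" ORDER BY sc."Changed_Date")
ORDER BY surr_key;
```

The only difference from a whole-partition aggregate is the frame, which stops at CURRENT ROW instead of UNBOUNDED FOLLOWING.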
One drawback of this data model compared to yours is that whenever a column is changed, added, or deleted in the upstream operational database, the dimension table above has to be rebuilt and the data moved over with the new DDL.