Postgresql - merge duplicate rows where one row is null

x89
August 21, 2024

I am using AWS Redshift. I have a table that looks like this:

Id             key             value1            value2        value3       
1              xxx             A                 NULL           NULL        
1              xxx             NULL              B              NULL
2              uuu             NULL              NULL           C

If the id repeats, the entire row is the same except for value1, value2 and value3. Of these three columns, only one is filled in any given row. How can I merge these rows so that my final result looks like this?

Id             key             value1              value2         value3
1              xxx              A                  B              NULL
2              uuu              NULL               NULL           C

If there’s a value for these 3 columns, I want to use it. Otherwise, I want to let the value be NULL.

When I try this:

Select 
id, key, a, b, 
max(value1) as value1,
max(value2) as value2,
 max(value3) as value3
from my_table group by id, key, a, b

I get this error, probably due to the NULLs.

failed to find conversion function from "unknown" to text

What else could I try to achieve my desired result?

Edit:

with test as (
 select
   id,
   key,
   'business_vault' as a,
   b
 from my_table
)
select 
  a
 from test group by a

Answers

- Belayer
- August 20, 2024 at 6:36 pm
You do not have columns a or b; as presented, those are values. You could select them, but they would have to be literal values enclosed in single quotes ('). I do not think that is what you are after, though. I think you get what you want by just removing them (see demo):
```
Select id, key
     , max(value1) as value1
     , max(value2) as value2
     , max(value3) as value3
  from my_table 
 group by id, key;
```
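
If a is meant to stay in the result as the constant shown in the question's edit, the error may come from the literal itself rather than from the NULLs. This is an assumption: on Redshift and older PostgreSQL versions, a bare string literal selected in a CTE or subquery can keep the type "unknown", and grouping by it then fails with the message quoted in the question. A minimal sketch with an explicit cast follows; the inline my_table data is only illustrative and b is left out because its origin is not clear from the question.

```sql
-- Hypothetical sample data standing in for my_table (from the question's example).
With my_table as (
    Select 1 as id, 'xxx'::text as key, 'A'::text as value1, NULL::text as value2, NULL::text as value3 Union All
    Select 1,       'xxx',              NULL,                'B',                  NULL                 Union All
    Select 2,       'uuu',              NULL,                NULL,                 'C'
),
test as (
    Select id, key,
           'business_vault'::text as a,   -- explicit cast, so a is text and not "unknown"
           value1, value2, value3
      from my_table
)
Select id, key, a
     , max(value1) as value1
     , max(value2) as value2
     , max(value3) as value3
  from test
 group by id, key, a
 order by id, key;

/* Result:
id  key  a               value1  value2  value3
1   xxx  business_vault  A       B       null
2   uuu  business_vault  null    null    C      */
```

Because a is the same constant in every row, adding it to the Group By does not split the groups.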

If we add columns a and b to your sample data in a way that those columns have different values in all rows then your sample data would be:

--  S a m p l e    D a t a : 
Create Table tbl AS 
Select 1 as id, 'xxx' as key, 'A' as value1, NULL as value2, NULL as value3, 'a_1' as a, 'b_1' as b Union All
Select 1,       'xxx',        NULL,          'B',            NULL,           'a_2',      'b_2'      Union All
Select 2,       'uuu',        NULL,          NULL,           'C',            'a_3',      'b_3';

… you can get the expected result (from the question) simply by using Max() aggregate function and Group By id, key …

--    S Q L : 
Select   id, key,
         Max(value1) as value1,
         Max(value2) as value2,
         Max(value3) as value3
From     tbl 
Group By id, key
Order By id, key

/*    R e s u l t : 
id  key value1  value2  value3
--  --- ------  ------  ------
1   xxx A       B       null
2   uuu null    null    C      */

… however, if you need the a and b columns included in the resultset (and a and b have different values for the same id and key) then there will be no reduction of duplicate id rows. If you need to select the a and b columns in an aggregated query then those columns (a, b) should be either aggregated or part of group by clause. …

-- a and b columns in Select and Group By clauses
Select   id, key, a, b,
         Max(value1) as value1,
         Max(value2) as value2,
         Max(value3) as value3
From     tbl 
Group By id, key, a, b
Order By id, key

/*    R e s u l t :
id  key  a      b      value1  value2  value3
--  ---  -----  -----  ------  ------  ------
1   xxx  a_1    b_1    A       null    null
1   xxx  a_2    b_2    null    B       null      
2   uuu  a_3    b_3    null    null    C      */
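
The other option mentioned above, aggregating a and b instead of grouping by them, could look like the sketch below. Using Max() for a and b is an arbitrary choice made for illustration; it simply keeps the largest value per id, key group and drops the others. The sample data is repeated as a CTE so the statement runs on its own.

```sql
-- Same sample data as above, inlined as a CTE.
With tbl as (
    Select 1 as id, 'xxx' as key, 'A' as value1, NULL as value2, NULL as value3, 'a_1' as a, 'b_1' as b Union All
    Select 1,       'xxx',        NULL,          'B',            NULL,           'a_2',      'b_2'      Union All
    Select 2,       'uuu',        NULL,          NULL,           'C',            'a_3',      'b_3'
)
-- a and b are aggregated, so only id and key remain in Group By
Select   id, key,
         Max(a) as a,
         Max(b) as b,
         Max(value1) as value1,
         Max(value2) as value2,
         Max(value3) as value3
From     tbl
Group By id, key
Order By id, key;

/* Result:
id  key  a    b    value1  value2  value3
1   xxx  a_2  b_2  A       B       null
2   uuu  a_3  b_3  null    null    C      */
```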

… the question is what you want to do with the data in columns a and b when you reduce the number of rows so that each id, key combination is shown just once. One option is to use analytic (window) functions: STRING_AGG() Over() to build a list of the a and b values and Max() Over() for value1, value2 and value3, keeping only Distinct rows …

Select   Distinct id, key, 
         STRING_AGG(a, ', ') Over(Partition By id, key) as a, 
         STRING_AGG(b, ', ') Over(Partition By id, key) as b, 
         Max(value1) Over(Partition By id, key) as value1,
         Max(value2) Over(Partition By id, key) as value2,
         Max(value3) Over(Partition By id, key) as value3
From     tbl 
Order By id, key

/*    R e s u l t : 
id  key  a          b         value1  value2  value3
--  ---  ---------  --------  ------  ------  ------
1   xxx  a_1, a_2   b_1, b_2  A       B       null
2   uuu  a_3        b_3       null    null    C      */
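
The same lists can also be built with STRING_AGG() as an ordinary aggregate under Group By, which avoids Distinct and lets an Order By inside the aggregate fix the order of the list items. This is a sketch in PostgreSQL syntax with the sample data inlined as a CTE. On Redshift, which the question mentions, STRING_AGG is not available as far as I know; LISTAGG(a, ', ') WITHIN GROUP (ORDER BY a) appears to be the closest counterpart there.

```sql
-- Same sample data as above, inlined as a CTE.
With tbl as (
    Select 1 as id, 'xxx' as key, 'A' as value1, NULL as value2, NULL as value3, 'a_1' as a, 'b_1' as b Union All
    Select 1,       'xxx',        NULL,          'B',            NULL,           'a_2',      'b_2'      Union All
    Select 2,       'uuu',        NULL,          NULL,           'C',            'a_3',      'b_3'
)
Select   id, key,
         STRING_AGG(a, ', ' Order By a) as a,   -- ordered list of a values per group
         STRING_AGG(b, ', ' Order By b) as b,   -- ordered list of b values per group
         Max(value1) as value1,
         Max(value2) as value2,
         Max(value3) as value3
From     tbl
Group By id, key
Order By id, key;

/* Result:
id  key  a         b         value1  value2  value3
1   xxx  a_1, a_2  b_1, b_2  A       B       null
2   uuu  a_3       b_3       null    null    C      */
```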

… as a special case, if columns a and b have the same value for every row of the same id, key combination, then the first query from this answer can be extended with a and b in the Select and Group By clauses and the duplicate rows are still merged …

WITH    --  N e w    S a m p l e    D a t a :
  tbl AS
    ( Select 1 as id, 'xxx' as key, 'A' as value1, NULL as value2, NULL as value3, 'a_1' as a, 'b_1' as b Union All
      Select 1,       'xxx',        NULL,          'B',            NULL,           'a_1',      'b_1'      Union All
      Select 2,       'uuu',        NULL,          NULL,           'C',            'a_3',      'b_3'
    )

--    S Q L :
Select   id, key, a, b, 
         Max(value1) as value1,
         Max(value2) as value2,
         Max(value3) as value3
From     tbl 
Group By id, key, a, b
Order By id, key

/*    R e s u l t :
id  key  a    b    value1  value2  value3
--  ---  ---  ---  ------  ------  ------
1   xxx  a_1  b_1  A       B       null
2   uuu  a_3  b_3  null    null    C        */

See the fiddle here.
