Postgresql - Join two tables with almost the same fields and check when two fields are different

CristinaBiagiotti
August 23, 2023
349 views
3 votes
3 Answers

I need your help and I’ll try to explain with a simple example.

I have Table A:

NAME	CITY	AGE
Lucas	Dallas	21
John	Chicago	30
Tim	London	15

and Table B:

NAME	CITY	AGE
Lucas	Dallas	21
John	Chicago	45
Tim	London	15

I want to join these two tables and I want that, where there’s a difference (in the example, John’s age), the result is NULL.
Consider that this is a simple example, in reality I have a lot more fields and I won’t know before which values are different (so I can’t use "AGE ISNULL", because the difference could be also in CITY).
The only field that will never be different is NAME.

Thank you in advance for your help!

I tried using left join but, instead of giving me ‘NULL’ where there’s a difference, I have two lines, for example:

NAME	CITY	AGE
John	Chicago	30
John	Chicago	45

I want only one line with NULL in the age.

Answers

- MuhammadSarmad
- August 23, 2023 at 5:59 pm
- 0 votes
0
You can combine an LEFT JOIN with a CASE statement to get the desired outcome of having NULL where there is a discrepancy between the two tables and just having one line with NULL values. Here is how to go about it:
```
SELECT 
    A.NAME, 
    A.CITY, 
    CASE WHEN A.AGE = B.AGE THEN A.AGE ELSE NULL END AS AGE
FROM 
    TableA A
LEFT JOIN 
    TableB B ON A.NAME = B.NAME AND A.CITY = B.CITY;
```
Based on the matching NAME and CITY fields in both records, the LEFT JOIN will merge the rows from both tables in this query. The SELECT clause’s CASE statement determines whether the AGE values in the two tables are equal. If they match, it will show the AGE value from Table A; if not, NULL will be shown. This will result in a single line for each distinct NAME and CITY combination, with NULL appearing in the AGE column in cases when the two tables differ.
Login or Signup to reply.

- nbk
- August 23, 2023 at 6:04 pm
- 0 votes
0
You can LEFT JOIN the send table B and only select Values from A
```
SELECT A."NAME", A."CITY", NULL as AGE FROM A 
  LEFT JOIN B
  ON A."NAME" =  B."NAME" AND  A."CITY" = B."CITY" AND  A."AGE" = B."AGE"
WHERE b."NAME" IS NULL
```
NAME CITY age

John Chicago null
```
SELECT 1
```
fiddle
Login or Signup to reply.

- xQbert
- August 23, 2023 at 6:27 pm
- 0 votes
0
Instead of join use set based operations. Set based operations allow us to compare one data set to another (All of A except those in B).

Keep in mind my example will compare every record and column to every record and column and show you when differences are found in any column. One can limit the columns for evalaution by specifing the columns desired for compairison instead of the * I use.

If we can assume table A and Table B have the exact same structure and data types and in the same order, then you can minus (except) B From A giving you all the differences for A, then union the inverse to get all the differences in B.

The except operator essentially eliminates like records from the result set when where a matching record is found in the B data set (all columns must match) from A; thus showing us differences in set A not in B. We do the inverse B except A and get differences in B not A and then union the results together. to get when a differnce exists on a record for the desired columns defined.
```
(SELECT * from Table A  --Subsitite your columsn for compairison instead of * in all of the 4 * cases.
EXCEPT
SELECT * FROM table B)

UNION ALL
(
SELECT * FROM TABLE B
EXCEPT
SELECT * FROM TABLE A)
```
Alternatively you could union the two tables together A and B
Group by the fields that you want to compare and get a count. Then return only those having a count = 1; though this assumes no duplicate records would exist in either data base set(A,B); though that could be manged with a subquery and a select distinct.

Something like…
```
With CTEA as 
(SELECT DISTINCT * From A),
CTEB as (SELECT DISTINCT * FROM B)
SELECT * FROM (
SELECT * FROM CTEA
UNION 
SELECT * FROM CTEB)
GROUP BY (Fields for compare)
HAVING COUNT =1 
```
Those with a count of 2 would have records in both A and B and we don’t want those. THe distinct used in the CTE’s would ensure we don’t evalute duplicates in a single set; which would cause our count to be 2 on such records.

More on set based operations with posgresql

More on set based operations in general
Login or Signup to reply.