skip to Main Content

There are 3 tables, A, B, C.

Table A has an id column, and B & C integer column which represents money and a_id which reference id in table A.

Below is an example of table A, B, C

table A

id
1
2
3

table B

id a_id money
1 1 1000
2 1 2000
3 2 3500

tableC

id a_id money
1 1 100
2 2 200
3 2 400

In this case, I wanna aggregate them like one of these.

a_id B_money_sum C_money_sum
1 2000 100
2 3500 600

or

a_id money_sum
1 2100
2 4100

I’ve tries this, but the result wasn’t expected one.
(aggregation was based on cartesian product)

select a_id, sum(b.money) B_money_sum, sum(c.money) C_money_sum
from A
join B on A.id = B.a_id
join C on A.id = C.a_id
group by a_id

Isn’t it possible only join, without subquery ?

2

Answers


  1. Isn’t it possible only join, without subquery ?

    No, because when there is more than one a_id (TableB and TableC) per id(TableA) that made a one-to-many join, which creates more rows (essentially, a duplication of rows, also known as "Relational Multiplication").

    A common approach to fix that issue is to isolate the SUM() in a separate query , subquery

    select a.id, B_money_sum , C_money_sum
    from tableA a 
    join (select a_id,sum(money) as B_money_sum
          from tableB
          group by a_id 
         ) b on a.id = b.a_id
    join (select a_id,sum(money) as C_money_sum
          from tableC
          group by a_id 
         ) c on a.id = c.a_id  ; 
    

    https://dbfiddle.uk/coZvQsuT

    Login or Signup to reply.
  2. This is a bit trickier if table b has an id which c doesn’t or vice versa. Let’s take this case for example.

    create table a (id int);
    create table b (id int,a_id int,money int);
    create table c (id int,a_id int,money int);
    insert a values(1),(2),(3),(4);
    insert b values(1,1,1000),(2,1,2000),(3,2,3500),(4,4,1000),(5,4,7000);
    insert c values(1,1,100),(2,2,200),(3,2,400),(4,3,200),(5,3,600);
    
    

    In this case, b has id 4 while c has id 3. If we use an inner join, both id 3 and id 4 will have their data missing in the result set:

    select a_id,sum_a,sum_b 
    from
        (select a_id ,sum(money) sum_a
        from a
        join b on a.id=b.a_id
        group by a_id) tb
    join
        (select a_id ,sum(money) sum_b
        from a
        join c on a.id=c.a_id
        group by a_id) tc
    using(a_id)
    ;
    
    +------+-------+-------+
    | a_id | sum_a | sum_b |
    +------+-------+-------+
    |    1 |  3000 |   100 |
    |    2 |  3500 |   600 |
    +------+-------+-------+
    
    

    However, if we use a one-sided outer join , we will still have some data missing on the other side.

    select a_id,sum_a,sum_b 
    from
        (select a_id ,sum(money) sum_a
        from a
        join b on a.id=b.a_id
        group by a_id) tb
    left join
        (select a_id ,sum(money) sum_b
        from a
        join c on a.id=c.a_id
        group by a_id) tc
    using(a_id)
    ;
    
    +------+-------+-------+
    | a_id | sum_a | sum_b |
    +------+-------+-------+
    |    1 |  3000 |   100 |
    |    2 |  3500 |   600 |
    |    4 |  8000 |  NULL |
    +------+-------+-------+
    
    select a_id,sum_a,sum_b 
    from
        (select a_id ,sum(money) sum_a
        from a
        join b on a.id=b.a_id
        group by a_id) tb
    right join
        (select a_id ,sum(money) sum_b
        from a
        join c on a.id=c.a_id
        group by a_id) tc
    using(a_id)
    ;
    
    +------+-------+-------+
    | a_id | sum_a | sum_b |
    +------+-------+-------+
    |    1 |  3000 |   100 |
    |    2 |  3500 |   600 |
    |    3 |  NULL |   800 |
    +------+-------+-------+
    
    

    In order to bridge the gap for all sides, a full join is required. But since MySQL has yet to implement a full join, a UNION is called for to combine the two outer joins.

    select * from
    (select a_id,sum_a,sum_b 
    from
        (select a_id ,sum(money) sum_a
        from a
        join b on a.id=b.a_id
        group by a_id) tb
    left join
        (select a_id ,sum(money) sum_b
        from a
        join c on a.id=c.a_id
        group by a_id) tc
    using(a_id)
    
    UNION
    
    select a_id,sum_a,sum_b 
    from
        (select a_id ,sum(money) sum_a
        from a
        join b on a.id=b.a_id
        group by a_id) tb
    right join
        (select a_id ,sum(money) sum_b
        from a
        join c on a.id=c.a_id
        group by a_id) tc
    using(a_id)
    ) t_a_b
    order by a_id
    ;
    
    +------+-------+-------+
    | a_id | sum_a | sum_b |
    +------+-------+-------+
    |    1 |  3000 |   100 |
    |    2 |  3500 |   600 |
    |    3 |  NULL |   800 |
    |    4 |  8000 |  NULL |
    +------+-------+-------+
    
    

    Note: The outmost layer is required to use the ORDER BY clause, which is ineffective in UNION.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search