skip to Main Content

I have two tables and I want to merge them.
Fiddle here: http://sqlfiddle.com/#!17/999d0

Result table is what I expect to get. Table utm is a source-main table and table report contains data for utm rows. What I need:

  • Take id and utm_ from utm table and add stats from table report with proper granulation.

Example

In table utm I’ve a row: (24611609, 'myTarget', 'Media', 'Social', NULL, NULL) and in table report I’ve 2 rows:

(24611609, '2022-08-01', 200, 150, 15, 'myTarget', 'Media', 'Social', 'premium', 'subcribe'),
(24611609, '2022-08-01', 25, 10, 1, 'myTarget', 'Media', 'Social', 'free', 'subcribe')

Common is: 'myTarget', 'Media', 'Social'

Proper granularity level is id, utm_campaign, utm_source, utm_medium, so I need to SUM and GROUP two rows by these keys. So for this I need something like this:

SELECT 
utm.row_id AS id,
utm.utm_campaign,
utm.utm_source,
utm.utm_medium,
utm.utm_content,
utm.utm_term,
report.date_of_visit,
sum(report.sessions) as sessions,
sum(report.pageviews) as pageviews,
sum(report.bounces) as bounces
FROM utm
inner join report on utm.row_id = report.id and utm.utm_campaign = report.utm_campaign and utm.utm_source = report.utm_source and utm.utm_medium = report.utm_medium
group by utm.row_id,
utm.utm_campaign,
utm.utm_source,
utm.utm_medium,
utm.utm_content,
utm.utm_term,
report.date_of_visit

I don’t know how to deal with all possible granularity combinations. My idea was just use diffrent JOINS variations and merge results with UNION like:

join on id, utm_campaign
union
...
join on id, utm_campaign, utm_medium
union
...
join on id, utm_campaign, utm_source
...

But it’s really stupid, I should create > 1000 unions and joins.

Any tips?

2

Answers


  1. Enjoy

    with 
    r as
    (
        select      id
                   ,date_of_visit
                   
                   ,sum(sessions)               as sessions
                   ,sum(pageviews)              as pageviews
                   ,sum(bounces)                as bounce
                   
                   ,coalesce(utm_campaign ,'')  as utm_campaign 
                   ,coalesce(utm_source   ,'')  as utm_source 
                   ,coalesce(utm_medium   ,'')  as utm_medium 
                   ,coalesce(utm_content  ,'')  as utm_content
                   ,coalesce(utm_term     ,'')  as utm_term   
                 
        from        report as r
    
        group by    id
                   ,date_of_visit
                   ,cube(6, 7, 8, 9, 10)
    
    )         
    select  r.*
    
    from            r 
    
            join    utm as u 
            
            on      r.id = u.row_id
            
                and (r.utm_campaign, r.utm_source, r.utm_medium, r.utm_content, r.utm_term)
                    is not distinct from 
                    (u.utm_campaign, u.utm_source, u.utm_medium, u.utm_content, u.utm_term)
       
    where   'NA' in (r.utm_campaign, r.utm_source, r.utm_medium, r.utm_content, r.utm_term) is not true
    
    id date_of_visit sessions pageviews bounce utm_campaign utm_source utm_medium utm_content utm_term
    28573041 2022-08-01 1000 900 10 Beeline_uppers_2022 Null Null Null Null
    24611609 2022-08-01 225 160 16 myTarget Media Social Null Null
    28573041 2022-08-01 900 885 34 shop_ smartfony my_beeline banner Null Null
    24611609 2022-08-01 1 1 0 campaign source medium content term

    Fiddle

    Login or Signup to reply.
  2. A more strightforward solution 🙂

    select      u.row_id
               ,r.date_of_visit
               
               ,sum(r.sessions)     as sessions
               ,sum(r.pageviews)    as pageviews
               ,sum(r.bounces)      as bounce
               
               ,u.utm_campaign
               ,u.utm_source  
               ,u.utm_medium  
               ,u.utm_content 
               ,u.utm_term    
             
    from                utm     as u
                join    report  as r
                on       u.row_id = r.id
                    and (u.utm_campaign = r.utm_campaign or u.utm_campaign is null)
                    and (u.utm_source   = r.utm_source   or u.utm_source   is null)
                    and (u.utm_medium   = r.utm_medium   or u.utm_medium   is null)
                    and (u.utm_content  = r.utm_content  or u.utm_content  is null)
                    and (u.utm_term     = r.utm_term     or u.utm_term     is null)
                                       
    group by    u.row_id
               ,r.date_of_visit
               ,u.utm_campaign
               ,u.utm_source  
               ,u.utm_medium  
               ,u.utm_content 
               ,u.utm_term    
    
    row_id date_of_visit sessions pageviews bounce utm_campaign utm_source utm_medium utm_content utm_term
    24611609 2022-08-01 1 1 0 campaign source medium content term
    24611609 2022-08-01 225 160 16 myTarget Media Social null null
    28573041 2022-08-01 1000 900 10 Beeline_uppers_2022 null null null null
    28573041 2022-08-01 900 885 34 shop_ smartfony my_beeline banner null null

    Fiddle

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search