
I have a list of countries, and for each country I have a table like this (first column: user, second column: user_type, third column: date):

user user_type date
user1 National 2022-10-01
user1 National 2022-10-01
user2 National 2022-10-01
user2 International 2022-10-01
user3 National 2022-10-02
user1 Unknown 2022-10-02
user1 National 2022-10-03

I would like to count the distinct users of each type, with one row per day, like this (4 columns: date, first_user_type, second_user_type, third_user_type):

date first_user_type second_user_type third_user_type
2022-10-01 2 1 0
2022-10-02 1 0 1
2022-10-03 1 0 0

However, if I run this query:

SELECT 
date, user_type, COUNT(distinct user) as num_users
FROM "country"."table" 
WHERE 
date between '2022-10-01' AND '2022-10-03' 
GROUP BY date, user_type 
ORDER BY date, user_type

Then I obtain the correct results but with the same date in different rows, like this:

date user_type num_users
2022-10-01 National 2
2022-10-01 International 1
2022-10-01 Unknown 0
2022-10-02 National 1
2022-10-02 International 0
2022-10-02 Unknown 1
2022-10-03 National 1
2022-10-03 International 0
2022-10-03 Unknown 0

There is an additional difficulty: there are 3 user_types globally ("National", "International" and "Unknown"), but some countries only have "National", or only "National" and "Unknown" (no "International"). I would like the result to show a 0 in that case, even if that user_type does not exist in that country.
The query should be the same for all countries, only changing "country".

(P.S.: the query will be run in AWS Athena.)

Any ideas?

2 Answers


  1. Conditional aggregation should do it here:

    SELECT date_, 
           COUNT(DISTINCT IF(user_type = 'National'     , user_, NULL)) AS numNational,
           COUNT(DISTINCT IF(user_type = 'International', user_, NULL)) AS numInternational,
           COUNT(DISTINCT IF(user_type = 'Unknown'      , user_, NULL)) AS numUnknown
    FROM tab
    GROUP BY date_
    

    Check the demo here.

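  2. CASE expressions work as well. Use COUNT(DISTINCT ...) rather than SUM(... THEN 1 ELSE 0 END): SUM counts rows, so a user who appears twice on the same day (user1 on 2022-10-01 in the sample) would be counted twice.

    SELECT date,
           COUNT(DISTINCT CASE WHEN user_type = 'National' THEN user END) AS National,
           COUNT(DISTINCT CASE WHEN user_type = 'International' THEN user END) AS International,
           COUNT(DISTINCT CASE WHEN user_type = 'Unknown' THEN user END) AS Unknown
    FROM "country"."table"
    WHERE date BETWEEN '2022-10-01' AND '2022-10-03'
    GROUP BY date
    ORDER BY date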
    

    I hope this helps!
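
For anyone who wants to sanity-check the conditional-aggregation approach before running it in Athena, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs the pivot query. Several things here are assumptions for illustration only: SQLite (through Python's sqlite3 module) stands in for Athena, the table name tab and the column names user_ and date_ follow the naming used in the first answer, and pivot_counts is a hypothetical helper. CASE is used in place of IF so the SQL stays portable across engines.

```python
import sqlite3

# Sample rows from the question: (user, user_type, date).
ROWS = [
    ("user1", "National", "2022-10-01"),
    ("user1", "National", "2022-10-01"),
    ("user2", "National", "2022-10-01"),
    ("user2", "International", "2022-10-01"),
    ("user3", "National", "2022-10-02"),
    ("user1", "Unknown", "2022-10-02"),
    ("user1", "National", "2022-10-03"),
]

# One COUNT(DISTINCT ...) per user type. A CASE without ELSE yields NULL
# for non-matching rows, and COUNT ignores NULLs, so a type that never
# occurs in a country simply produces 0.
QUERY = """
SELECT date_,
       COUNT(DISTINCT CASE WHEN user_type = 'National'      THEN user_ END) AS num_national,
       COUNT(DISTINCT CASE WHEN user_type = 'International' THEN user_ END) AS num_international,
       COUNT(DISTINCT CASE WHEN user_type = 'Unknown'       THEN user_ END) AS num_unknown
FROM tab
WHERE date_ BETWEEN '2022-10-01' AND '2022-10-03'
GROUP BY date_
ORDER BY date_
"""


def pivot_counts(rows):
    """Load rows into a throwaway in-memory table and run the pivot query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab (user_ TEXT, user_type TEXT, date_ TEXT)")
    conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pivot_counts(ROWS):
        print(row)
```

With the sample data this prints one row per day with the counts 2/1/0, 1/0/1 and 1/0/0, matching the table the question asks for. Passing only the "National" rows still returns all three count columns, with 0 for the types that are missing, which is the behaviour wanted for countries that lack a user type.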
