skip to Main Content

This might be difficult to explain. But Im trying to write a redshift sql query where I have want the count of organizations that fall into different market buckets. There are 50 markets. For example company x can be only be found in 1 market and company y can be found in 3 markets. I want to preface that I have over 10,000 companies to fit into these buckets. So ideally it would be more like, hypothetically 500 companies are found in 3 markets or 7 companies are found in 50 markets.

The table would like

Market Bucket Org Count
1 Markets 3
2 Markets 1
3 Markets 0

select count(distinct case when enterprise_account = true and (market_name then organization_id end) as "1 Market" from organization_facts

I was trying to formulate the query from above but I got confused on how to effectively formulate the query

Organization Facts

Market Name Org ID Org Name
New York 15683 Company x
Orlando 38478 Company y
Twin Cities 2738 Company z
Twin Cities 15683 Company x
Detroit 99 Company xy

2

Answers


  1. You would need a sub-query that retrieves the number of markets per company, and an outer query that summarises into a count of markets.

    Something like:

    with markets as (
      select
        org_name,
        count(distinct market_name) as market_count
      from organization_facts
    )
    
    select
      market_count,
      count(*) as org_count
    from markets
    group by market_count
    order by market_count
    
    Login or Signup to reply.
  2. If I follow you correctly, you can do this with two levels of aggregation. Assuming that org_id represents a company in your dataset:

    select cnt_markets, count(*) cnt_org_id
    from (select count(*) cnt_markets from organization_facts group by org_id) t
    group by cnt_markets
    

    The subquery counts the number of markets per company. I assumed no duplicate (ord_id, market_name) tuples in the table ; if that’s not the case, then you need count(distinct market_name) instead of count(*) in that spot.

    Then, the outer query just counts how many times each market count occurs in the subquery, which yields the result that you want.

    Note that I left apart the enterprise_account column ,that appears in your query but not in your data.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search