skip to Main Content

enter image description here

in the example data set above, I want to assign a Primary_Store for each buyer_id based on the following criteria

  • if one store has more visit counts than any other store for that buyer, assign as primary
  • if no, then
    —- > if stores with most visits are equal, assign Primary_Store according to hierarchy of store_type: 1. Retail, 2.Online, 3.Event
    ————> if stores with most visits are equal, and store_type is the same, then assign Primary_Store as store with earliest visit date

By this logic the results should be:
Buyer ID : Primary_Store
1001: Whole Foods
1002: Farmer’s Market
1003: Costco

I have been trying to use a case when statement but i am having difficulty applying the different types of criteria at once.

2

Answers


  1. So the simple way is to use a RANKING window function of which ROW_NUMBER is one, and it does not allocate duplicate rank values.

    thus:

    SELECT 
        *
        ,decode(store_type, 'Retail', 1, 'Online', 2, 'Event', 3) as rnk2
        ,row_number() over (partition by buyer_id order by visit_count desc, rnk2, first_visit) as rn
    FROM VALUES
        (1001, 'Whole Foods', '2020-05-31'::date, 'Retail', 10),
        (1001, 'Instacart', '2020-03-15'::date, 'Online', 10),
        (1001, 'Farmer"s Market', '2020-01-24'::date, 'Event', 4),
        (1002, 'Trader Joes', '2020-02-12'::date, 'Retail', 9)
        
        t(buyer_id, store, first_visit, store_type, visit_count)
    

    for this partial copy of your data, we can use three priorities in the ranking ORDER BY

    BUYER_ID STORE FIRST_VISIT STORE_TYPE VISIT_COUNT RNK2 RN
    1001 Whole Foods 2020-05-31 Retail 10 1 1
    1001 Instacart 2020-03-15 Online 10 2 2
    1001 Farmer"s Market 2020-01-24 Event 4 3 3
    1002 Trader Joes 2020-02-12 Retail 9 1 1

    thus it can be moved to a QUALIFY like and we can take only the first value:

    SELECT 
        *
    FROM VALUES
        (1001, 'Whole Foods', '2020-05-31'::date, 'Retail', 10),
        (1001, 'Instacart', '2020-03-15'::date, 'Online', 10),
        (1001, 'Farmer"s Market', '2020-01-24'::date, 'Event', 4),
        (1002, 'Trader Joes', '2020-02-12'::date, 'Retail', 9)
        
        t(buyer_id, store, first_visit, store_type, visit_count)
    qualify row_number() over (
            partition by buyer_id 
            order by visit_count desc, 
                decode(store_type, 'Retail', 1, 'Online', 2, 'Event', 3), 
                first_visit) = 1
        ;
    

    gives:

    BUYER_ID STORE FIRST_VISIT STORE_TYPE VISIT_COUNT
    1001 Whole Foods 2020-05-31 Retail 10
    1002 Trader Joes 2020-02-12 Retail 9

    So to reflect Rajat’s answer of using FIRST_VALUE to retrieve the "primary store" as a value, which opens the idea, that you might also just want to know "is this the one". which can be done with:

    SELECT 
        *
        ,iff(row_number() over (
            partition by buyer_id 
            order by visit_count desc, 
                decode(store_type, 'Retail', 1, 'Online', 2, 'Event', 3), 
                first_visit) = 1, 'yes', 'no' ) as primary_store
    FROM your_table
    
    BUYER_ID STORE FIRST_VISIT STORE_TYPE VISIT_COUNT PRIMARY_STORE
    1001 Whole Foods 2020-05-31 Retail 10 yes
    1001 Instacart 2020-03-15 Online 10 no
    1001 Farmer"s Market 2020-01-24 Event 4 no
    1002 Trader Joes 2020-02-12 Retail 9 yes

    the basic answer, of how can you rank the data, is the same between the methods, it then comes down to how you wish to use that rank.

    Login or Signup to reply.
  2. You seem to be looking for way to assign a primary store to existing rows without collapsing them. We can modify Simeon’s solution a little bit. I swapped decode out for array_position for personal preference but you can go with either of the methods

    select *, first_value(store) over 
             (partition by buyer_id 
              order by visit_count desc, 
                       array_position(store_type::variant,['Retail','Online', 'Event']), 
                       first_visit asc) as primary_store
    from your_table;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search