How to assign a primary value from Column B based on hierarchy of criteria across multiple columns (SQL) - Mysql

user18623003
November 28, 2022
252 views
1 vote
2 Answers

in the example data set above, I want to assign a Primary_Store for each buyer_id based on the following criteria

if one store has more visit counts than any other store for that buyer, assign as primary
if no, then
—- > if stores with most visits are equal, assign Primary_Store according to hierarchy of store_type: 1. Retail, 2.Online, 3.Event
————> if stores with most visits are equal, and store_type is the same, then assign Primary_Store as store with earliest visit date

By this logic the results should be:
Buyer ID : Primary_Store
1001: Whole Foods
1002: Farmer’s Market
1003: Costco

I have been trying to use a case when statement but i am having difficulty applying the different types of criteria at once.

Answers

So the simple way is to use a RANKING window function of which ROW_NUMBER is one, and it does not allocate duplicate rank values.

thus:

SELECT 
    *
    ,decode(store_type, 'Retail', 1, 'Online', 2, 'Event', 3) as rnk2
    ,row_number() over (partition by buyer_id order by visit_count desc, rnk2, first_visit) as rn
FROM VALUES
    (1001, 'Whole Foods', '2020-05-31'::date, 'Retail', 10),
    (1001, 'Instacart', '2020-03-15'::date, 'Online', 10),
    (1001, 'Farmer"s Market', '2020-01-24'::date, 'Event', 4),
    (1002, 'Trader Joes', '2020-02-12'::date, 'Retail', 9)
    
    t(buyer_id, store, first_visit, store_type, visit_count)

for this partial copy of your data, we can use three priorities in the ranking ORDER BY

BUYER_ID	STORE	FIRST_VISIT	STORE_TYPE	VISIT_COUNT	RNK2	RN
1001	Whole Foods	2020-05-31	Retail	10	1	1
1001	Instacart	2020-03-15	Online	10	2	2
1001	Farmer"s Market	2020-01-24	Event	4	3	3
1002	Trader Joes	2020-02-12	Retail	9	1	1

thus it can be moved to a QUALIFY like and we can take only the first value:

SELECT 
    *
FROM VALUES
    (1001, 'Whole Foods', '2020-05-31'::date, 'Retail', 10),
    (1001, 'Instacart', '2020-03-15'::date, 'Online', 10),
    (1001, 'Farmer"s Market', '2020-01-24'::date, 'Event', 4),
    (1002, 'Trader Joes', '2020-02-12'::date, 'Retail', 9)
    
    t(buyer_id, store, first_visit, store_type, visit_count)
qualify row_number() over (
        partition by buyer_id 
        order by visit_count desc, 
            decode(store_type, 'Retail', 1, 'Online', 2, 'Event', 3), 
            first_visit) = 1
    ;

gives:

BUYER_ID	STORE	FIRST_VISIT	STORE_TYPE	VISIT_COUNT
1001	Whole Foods	2020-05-31	Retail	10
1002	Trader Joes	2020-02-12	Retail	9

So to reflect Rajat’s answer of using FIRST_VALUE to retrieve the "primary store" as a value, which opens the idea, that you might also just want to know "is this the one". which can be done with:

SELECT 
    *
    ,iff(row_number() over (
        partition by buyer_id 
        order by visit_count desc, 
            decode(store_type, 'Retail', 1, 'Online', 2, 'Event', 3), 
            first_visit) = 1, 'yes', 'no' ) as primary_store
FROM your_table

BUYER_ID	STORE	FIRST_VISIT	STORE_TYPE	VISIT_COUNT	PRIMARY_STORE
1001	Whole Foods	2020-05-31	Retail	10	yes
1001	Instacart	2020-03-15	Online	10	no
1001	Farmer"s Market	2020-01-24	Event	4	no
1002	Trader Joes	2020-02-12	Retail	9	yes

the basic answer, of how can you rank the data, is the same between the methods, it then comes down to how you wish to use that rank.

- Radagast
- November 28, 2022 at 11:42 pm
- 0 votes
0
You seem to be looking for way to assign a primary store to existing rows without collapsing them. We can modify Simeon’s solution a little bit. I swapped decode out for array_position for personal preference but you can go with either of the methods
```
select *, first_value(store) over 
         (partition by buyer_id 
          order by visit_count desc, 
                   array_position(store_type::variant,['Retail','Online', 'Event']), 
                   first_visit asc) as primary_store
from your_table;
```
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

How to assign a primary value from Column B based on hierarchy of criteria across multiple columns (SQL) – Mysql

Answers