Data Table
Name Company Continent country state district
Tom HP Asia India Assam Kdk
George SAP Africa Sudan Chak ksk
Bill EBAY Europe Denmark Lekh Sip
Charles WM Asia India Haryana Jhat
Chip WM Asia India Punjab Chista
Chia WM Asia India Punjab Mast
Rule Table
Continent country state district Pass
Asia India ALL ALL Yes
Asia India Punjab ALL NO
Asia India Punjab Mast Yes
I have two tables in Hive. Depending on the rule I have to filter out the data in the data table.
In the rule table there is a column called pass which determines whether a record in data table needs to be filtered or not.
In this example there are different kinds of rules. They are the ones at broader level and at narrow level.
The rules at narrow level should not affect the rules at broader level. This means the rules at narrow level is an exception to rules at broader level.
For ex: in the rules table, there are 3 records. The first record is the rule at broader level. The other ones are at narrow level.
The first rules says to pass all the records that have country as india,state as any/all and district as any/all.
The second rule says to not pass all the records that have country as India, state as punjab and district as any/all.
The third rule says to pass all records that have country as India,state as punjab and district as Mast.
The second rule is an exception to first rule. The third rule is an exception to second rule.
Considering the data in the data table and rules in the rules table, the pass columns will be as follows for the Indian(country) records.
Name Company Continent country state district Pass
Tom HP Asia India Assam Kdk Yes
Charles WM Asia India Haryana Jhat Yes
Chia WM Asia India Punjab Mast Yes
Chip WM Asia India Punjab Chista No
This is just an example. In production the data will be different.
How do I implement this using SQL/Sql script?
Help is much appreciated.
2
Answers
You want the most specific rule. In Hive, you can use multiple
left join
s:You might want to continue with the
LEFT JOIN
s if'ALL'
is allowed forcontinent
andcountry
.@TomG : Please see the below code if that helps