I have a table similar to this, where I need to pull the value out of each JSON field that has an id equal to 2. I’ve tried multiple examples, but cannot manage to wrap my head around how to manipulate json_tuples / lateral views / explodes into my scenario.
value1,value2,value3,"[{""id"":1,""value"":""x""},{"id"":2,""value"":""y""}, {"id"":3,""value"":""blah""}]"
value4,value5,value6,"[{""id"":1,""value"":""a""},{"id"":2,""value"":""b""}, {"id"":3,""value"":""blahblah""}]"
col1 | col2 | col3 | id2value |
---|---|---|---|
value1 | value2 | value3 | y |
value4 | value5 | value6 | b |
Examples Ive tried to manipulate:
How to extract selected values from json string in Hive
- I don’t know how to check for a value within the sub-row
Hive Sql Query To get Json Object from Json Array
-dont know how to combine this with selecting the normal columns
2
Answers
Your example is a little unclear, so I’m making some guesses and assumptions (the value* strings are in separate columns, your json string is stored in an array column,… And your quotes in your json array are kind of a mess, that’s not valid json.)
You can use lateral view on your json array (technically an array of structs), and then you can filter on that.
EDIT:
If your json is just stored as a string array(of valid json), you have to go through some extra hoops. You convert it to an array with your regex and explode. Then you can pull the individual elements out using json_tuple. If there’s a better/less ugly way to do this, I don’t know it.
I’ve Declared a table with same values as you said :
and after that I read all data of col1 , col12 , col13 as one JSON string and select the values with id=2
the final result is :
y
b