I have created a user-defined function in Python in my Snowflake db to parse specific JSON key.
create or replace function get_tag(tag_json string, tag_key STRING)
RETURNS string
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('json5')
HANDLER = 'get_tag_py'
AS
$$
import json
def get_tag_py(json_data, key):
data = json.loads(json_data)
value = data.get(key)
return value
$$;
It works as expected but the performance is quite slow. I converted the function to JavaScript as shown below:
CREATE OR REPLACE FUNCTION get_tag_js(json_data string, key string)
RETURNS string
LANGUAGE JAVASCRIPT
AS
$$
var json_data, key;
var parsedData = JSON.parse(json_data);
value = parsedData[key];
return value;
$$
;
It fails at runtime with the following error:
select get_tag_js(json_column, 'MY_KEY') from my_orders_tbl;
JavaScript execution error: Uncaught SyntaxError: "undefined" is not valid JSON in GET_TAG_JS at 'undefined' position 0
Can someone please tell me what I am missing here?
3
Answers
I removed the “var json_ data, key;” line and made “json_data” and “key” parameters in the function definition and everywhere else in upper case letters only.
It works as expected. Now I need to mod it to take a key regardless of case.
Answer to your Question:
Your function should not declare the same named parameters, this is why the are undefined. Also the doc’s state the parameters will be UPPER CASE, because that is how all tokens are treated in SQL, thus in your JS they need to be upper case:
and now we can use those
If you really want lower case parameters in the function (that should not exist as shown (but this could be a toy example)) then use quotes to force them to lower case like:
Case insensitive, can be achieved by pulling the object apart and doing a case insensitive match. Which is more likely still faster than the exact same steps in JavaScript/Python
There is no need to create a UDF to parse JSON in Snowflake. For simple JSON, you may query the key directly
For more complex JSON, you may need to use FLATTEN. See Querying Semi-structured Data for more information.