skip to Main Content

I am trying to query a table in Snowflake that is built with a single field labeled "Value" and each row is a separate JSON object.

Row 1: {
  "artist": {
    "artistID": "artist_1",
    "artistName": "Roblox"
  },
  "descriptors": {
    "styles": [
      {
        "ID": "84121",
        "weight": "63"
      },
      {
        "ID": "83983",
        "weight": "14"
      }
    ],
  "duration": "240509",
   "productCodes": [
    {
      "type": "ISRC",
      "value": "isrc_1"
    }
  ]
}

Row 2: {
  "artist": {
    "artistID": "artist_2",
    "artistName": "Minecraft"
  },
  "descriptors": {
    "styles": [
      {
        "ID": "84122",
        "weight": "12"
      },
      {
        "ID": "83983",
        "weight": "14"
      }
    ],
  "duration": "400001",
   "productCodes": [
    {
      "type": "ISRC",
      "value": "isrc_2"
    }
  ]
}

Row 3: {
  "artist": {
    "artistID": "artist_3",
    "artistName": "Fortnite"
  },
  "descriptors": {
    "styles": [
      {
        "ID": "84121",
        "weight": "47"
      },
      {
        "ID": "83983",
        "weight": "14"
      }
    ],
  "duration": "300001",
  "productCodes": [
    {
      "type": "ISRC",
      "value": "isrc_3"
    },
    {
      "type": "ISRC",
      "value": "isrc_4"
    }
  ]
}

What I am trying to do is SELECT a column which includes all of the ISRCs, and another column which includes their associated styles. As you can see, some rows can have multiple ISRCs, and each row can have multiple styles. The output dataframe should look like this:

enter image description here

I’m having a hard time wrapping my head around the nested arrays – can you point me in the right direction? Thank you!

2

Answers


  1. this worked for me :

    WITH json_data AS (
      SELECT 
        PARSE_JSON('{
          "artist": {
            "artistID": "artist_1",
            "artistName": "Roblox"
          },
          "descriptors": {
            "styles": [
              {
                "ID": "84121",
                "weight": "63"
              },
              {
                "ID": "83983",
                "weight": "14"
              }
            ],
            "duration": "240509",
            "productCodes": [
              {
                "type": "ISRC",
                "value": "isrc_1"
              }
            ]
          }
        }') AS data
    )
    
    SELECT 
      pc.value:value::string AS isrc,
      s.value:ID::string AS style_id,
      s.value:weight::string AS style_weight
    FROM json_data jd,
      LATERAL FLATTEN(input => jd.data:descriptors:productCodes) pc,
      LATERAL FLATTEN(input => jd.data:descriptors:styles) s
    WHERE pc.value:type::string = 'ISRC';
    

    Output

    enter image description here

    Login or Signup to reply.
  2. So using a CTE just like Ahmed:

    with cte_data(value) as (
        select parse_json($1) from values
        ('{"artist":{"artistID":"artist_1","artistName":"Roblox"},"descriptors":{"duration":"240509","productCodes":[{"type":"ISRC","value":"isrc_1"}],"styles":[{"ID":"84121","weight":"63"},{"ID":"83983","weight":"14"}]}}'),
        ('{"artist":{"artistID":"artist_2","artistName":"Minecraft"},"descriptors":{"duration":"400001","productCodes":[{"type":"ISRC","value":"isrc_2"}],"styles":[{"ID":"84122","weight":"12"},{"ID":"83983","weight":"14"}]}}'),
        ('{"artist":{"artistID":"artist_3","artistName":"Fortnite"},"descriptors":{"duration":"300001","productCodes":[{"type":"ISRC","value":"isrc_3"},{"type":"ISRC","value":"isrc_4"}],"styles":[{"ID":"84121","weight":"47"},{"ID":"83983","weight":"14"}]}}')
    )
    select * from cte_data
    

    this really could be a table, but a CTE makes no clean-up.

    so that is your example data, and you are JOIN to projections from that data together with an implicit CROSS JOIN that is what the comma after the FROM table name does.

    I explain this, because you mention, your code get you null:

    because if I use the SQL you put into the comment against, CTE of your data it works:

    with cte_data(data) as (
        select parse_json($1) from values
        ('{"artist":{"artistID":"artist_1","artistName":"Roblox"},"descriptors":{"duration":"240509","productCodes":[{"type":"ISRC","value":"isrc_1"}],"styles":[{"ID":"84121","weight":"63"},{"ID":"83983","weight":"14"}]}}'),
        ('{"artist":{"artistID":"artist_2","artistName":"Minecraft"},"descriptors":{"duration":"400001","productCodes":[{"type":"ISRC","value":"isrc_2"}],"styles":[{"ID":"84122","weight":"12"},{"ID":"83983","weight":"14"}]}}'),
        ('{"artist":{"artistID":"artist_3","artistName":"Fortnite"},"descriptors":{"duration":"300001","productCodes":[{"type":"ISRC","value":"isrc_3"},{"type":"ISRC","value":"isrc_4"}],"styles":[{"ID":"84121","weight":"47"},{"ID":"83983","weight":"14"}]}}')
    )
    SELECT 
        pc.value:value::string AS isrc, 
        s.value:ID::string AS style_id, 
        s.value:weight::string AS style_weight 
    FROM cte_data as jd, 
    LATERAL FLATTEN(input => jd.data:descriptors:productCodes) as pc, 
    LATERAL FLATTEN(input => jd.data:descriptors:styles) as s 
    WHERE pc.value:type::string = 'ISRC' ; 
    
    ISRC STYLE_ID STYLE_WEIGHT
    isrc_1 84121 63
    isrc_1 83983 14
    isrc_2 84122 12
    isrc_2 83983 14
    isrc_3 84121 47
    isrc_3 83983 14
    isrc_4 84121 47
    isrc_4 83983 14

    so to debug this I would first check pc is flatten’ing correctly:

    SELECT 
        pc.value:value::string AS isrc
    FROM cte_data as jd, 
    LATERAL FLATTEN(input => jd.data:descriptors:productCodes) as pc
    

    enter image description here

    if you get for your example data, 4+ row of null, you are pulling pc.value apart incorreclty.

    if you get zero rows, the data you are feeding into the flatten is wrong.

    if you get four perfect rows as my picture, you can now swap to check the values getting pulled apart for s.

    another thing you can do to check what way the data is, is to cast jd.data to text and the push into parse_json and see if it then works.

    SELECT 
        pc.value:value::string AS isrc
    FROM cte_data as jd, 
    LATERAL FLATTEN(input => parse_json(jd.data::text):descriptors:productCodes) as pc
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search