skip to Main Content

I have a table say t_records which has a JSON type column say col_json which can have value like below

{
  "1": {
    "value": null,
    "comment": null,
    "timestamp": null,
    "guidelineId": null,
    "pushedToSnowflakePipeline": "yes"
  },
  "2": {
    "value": null,
    "comment": null,
    "timestamp": null,
    "guidelineId": null,
    "pushedToSnowflakePipeline": "yes"
  },
  "3": {
    "value": null,
    "comment": null,
    "timestamp": null,
    "guidelineId": null,
    "pushedToSnowflakePipeline": "no"
  }
}

I want to query the rows which have isPushedToSnowflakePipeline=no

I tried with below query but it is not returning any records.

SELECT * FROM t_records WHERE JSON_EXTRACT(col_json, '$.*.pushedToSnowflakePipeline')='no';

Where am I going wrong with the query?

2

Answers


  1. As noted in the comments, your attempted where condition is failing because json_extract() is returning a json array that contains all of the keys at that wildcard path. An array can’t be equal to a scalar/integer value and so you get nothing returned.

    Since the requirement is to return only rows where all of the nested pushedToSnowflakePipeline keys are "yes", you can use function json_contains() on the array that is being returned by your json_extract() function.

    Consider:

    CREATE TABLE t_records (col_json JSON);
    INSERT INTO t_records VALUES ('{
      "1": {
        "value": null,
        "comment": null,
        "timestamp": null,
        "guidelineId": null,
        "pushedToSnowflakePipeline": "yes"
      },
      "2": {
        "value": null,
        "comment": null,
        "timestamp": null,
        "guidelineId": null,
        "pushedToSnowflakePipeline": "yes"
      },
      "3": {
        "value": null,
        "comment": null,
        "timestamp": null,
        "guidelineId": null,
        "pushedToSnowflakePipeline": "no"
      }
    }');
    INSERT INTO t_records VALUES ('{
      "1": {
        "value": null,
        "comment": null,
        "timestamp": null,
        "guidelineId": null,
        "pushedToSnowflakePipeline": "yes"
      },
      "2": {
        "value": null,
        "comment": null,
        "timestamp": null,
        "guidelineId": null,
        "pushedToSnowflakePipeline": "yes"
      },
      "3": {
        "value": null,
        "comment": null,
        "timestamp": null,
        "guidelineId": null,
        "pushedToSnowflakePipeline": "yes"
      }
    }');
    

    Using json_contains(json_col->'$.*.pushedToSnowflakePipelineh', '"no"') = 0 will filter out the records you aren’t interested in. -> is just shorthand for json_extract() in this case.

    SELECT col_json->'$.*.pushedToSnowflakePipeline'
    FROM t_records
    WHERE JSON_CONTAINS(col_json->'$.*.pushedToSnowflakePipeline', '"no"') = 0
    

    col_json->'$.*.pushedToSnowflakePipeline'
    -----------------------------------------
    ["yes", "yes", "yes"]
    

    Working dbfiddle here

    Login or Signup to reply.
  2. Analysis

    If this is the typical JSON that is stored, the value that will be returned from this query is an array since we iterate over multiple keys in this case ["1","2","3"].

    Your current where statement:

    JSON_EXTRACT(col_json, '$.*.pushedToSnowflakePipeline');
    

    returns, this:
    ["yes", "yes", "no"], which it is not filtering properly when you check if this value ="no"

    Solution

    If you are wanting to check if any of these values = "no", then you need to first get the array of pushedToSnowflakePipeline and then filter that array to see if the values contained are "no".

    Update your WHERE statement to this:

    JSON_CONTAINS(JSON_EXTRACT(assay_data, '$.*.pushedToSnowflakePipeline'), '"no"', "$")
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search