skip to Main Content

I have a H2 database (version 2.2.220) with data in a JSON column; I need to migrate the column via a SQL script so that the data is in a conventional relational database structure. The main stumbling block is that on extraction of a JSONValue string value (surrounded by double quotes in JSON format), I am struggling to convert this into a VARCHAR value that does not include the surrounding double quotes.

I hope the below SQL statements illustrate the issue in the most rudimentary way:

  1. create table my_table (
        id        int generated by default as identity,
        my_json   JSON null default null,
    );
    
    insert into my_table (my_json) values (JSON '"abcdef"');
    

    The double quotes inside the string constant is necessary, else it is not recognized as valid JSON.

  2. 1st query (naive):
    select my_json from my_table;
    

    This returns "abcdef" with quotes included in the string. I would however like the raw string value abcdef to be returned without the surrounding quotes.

  3. 2nd query (typecasting):
    select CAST(my_json as VARCHAR(255)) from my_table;
    

    This still returns "abcdef" with quotes included in the string.

  4. 3rd query ("brute-force" trimming):
    select BTRIM(my_json, '"') from my_table;
    

    This achieves the desired result abcdef, but is problematic. One way this will cause a bug is when the string value should intentionally include a double quote at the beginning and/or end, e.g. xyz" (JSON '"xyz""') – this will be trimmed to return xyz.

I am wondering if there is a "recommended"/cleaner way to do this conversion that is less bug-prone.

Related/Interesting

2

Answers


  1. Im afraid that the safest method to achieve this in the H2 database is by using a user-defined function (UDF). Unfortunately, there’s no direct way to remove those quotes using H2’s built-in functions, as the H2 database doesn’t allow for querying individual JSON values directly. H2 stores JSON data as plain strings. So, when you save "abcdef" into a table (whether you use JSON '"abcdef"', or 'abcdef'::JSON), the query select my_json from my_table; always returns the entire JSON object (which is "abcdef") rather than just the value (abcdef), just like if you would query for a JSON object (for example select JSON '{"foo":"bar"}';), there is no native way to extract the bar value from the foo field.

    So if you need an individual value, I recommend creating a UDF like this:

    DROP ALIAS IF EXISTS STRIP_JSON_QUOTES;
    
    CREATE ALIAS STRIP_JSON_QUOTES AS $$
    String strip_json_quotes(String jsonString) {
        if (jsonString != null && !jsonString.isEmpty()) {
            if (jsonString.startsWith(""") && jsonString.endsWith(""") && jsonString.length() > 1) {
                return jsonString.substring(1, jsonString.length() - 1);
            }
        }
        return jsonString;
    }
    $$;
    

    And then you can call it like this:

    select strip_json_quotes(my_json) from my_table;
    

    I wouldn’t recomend using BTRIM, as it removes all leading and trailing characters, not just the outermost ones. For example, in a rare case where you want to store a value with quotes, BTRIM would remove too much:

    sql> select BTRIM('"abcdef"'::JSON, '"') as trimmed_json_value;
    TRIMMED_JSON_VALUE
    "abcdef
    (1 row, 29 ms)
    
    Login or Signup to reply.
  2. Use regexp_replace() to delete (replace with blank) quotes but only at the start or end of the value:

    select regexp_replace(my_json, '^"|"$', '')
    

    Regex breakdown:

    • ^ means "start of input"
    • | means "logical OR" (an alternation in regex speak)
    • $ means "end of input"
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search