skip to Main Content

I need help splitting the string below using REGEXP_SPLIT_TO_ARRAY in a Postgres function:

keyone="valone" key_two="val two" KEY_THREE=val three keyfour="http://sample.domain?par1=val1&par2=val2" keyFIVE=valfive

and the resultant array should look like this:

(keyone="valone",key_two="val two",KEY_THREE=val three,keyfour="http://sample.domain?par1=val1&par2=val2",keyFIVE=valfive)

I ended up with this

myarray := REGEXP_SPLIT_TO_ARRAY(mystring, E'\s(?=([^"]*"[^"]*")*[^"]*$)');

but it is failing to split when value contains space and not enclosed in double quotes (KEY_THREE=val three).

2

Answers


  1. Try the following regular expression (w+)=(?:"([^"]+)"|((?:(?!s+w+=).)+)).
    This regular expression pattern consists of two main parts:
    (w+): This part captures a word character sequence. The w represents any word character (letters, digits, or underscores), and the + indicates that one or more word characters should be captured.
    =(?:"([^"]+)"|((?:(?!s+w+=).)+)): This part matches the equal sign = followed by a value. It uses a non-capturing group (?:...) to match either a quoted string ("([^"]+)") or a non-quoted string (((?:(?!s+w+=).)+)).
    "([^"]+)": This subpattern captures a quoted string. It matches a double quote " followed by one or more characters that are not a double quote ([^"]+), and then another double quote ".
    ((?:(?!s+w+=).)+): This subpattern captures a non-quoted string. It matches one or more characters . that are not followed by a space, one or more word characters, an equal sign, and more word characters ((?!s+w+=)).
    The regex explanation is borrowed from here.

    select
      array_agg(array[
        grp[1],
        coalesce(grp[2], grp[3])
      ]) as myarray
    from regexp_matches('''keyone="valone"
    key_two="val two"
    KEY_THREE=val three
    keyfour="http://sample.domain?par1=val1&par2=val2"
    keyFIVE=valfive foo=bar boz-woz''',
          '(w+)=(?:"([^"]+)"|((?:(?!s+w+=).)+))',
          'g') as matches(grp);
    

    Results:

    +---------------------------------------------------------------------------------------------------------------------------------------------------------+
    | myarray                                                                                                                                                 |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------+
    | {{keyone,valone},{key_two,"val two"},{KEY_THREE,"val three"},{keyfour,http://sample.domain?par1=val1&par2=val2},{keyFIVE,valfive},{foo,"bar boz-woz'"}} |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------+
    

    Try it on db<>fiddle.

    Login or Signup to reply.
  2. Write a FUNCTION and then call the function using

    SELECT split_key_value_pairs('keyone="valone" key_two="val two" KEY_THREE=val three keyfour="http://sample.domain?par1=val1&par2=val2" keyFIVE=valfive');
    

    Function here

       CREATE OR REPLACE FUNCTION split_key_value_pairs(input_str text)
        RETURNS text[] AS $$
        DECLARE
            parts text[];
            part text;
            in_quote boolean := false;
            buffer text := '';
        BEGIN
            FOR i IN 1..length(input_str) LOOP
                part := substr(input_str, i, 1);
        
                IF part = '"' THEN
                    in_quote := NOT in_quote;
                END IF;
        
                IF part = ' ' AND NOT in_quote THEN
                    parts := array_append(parts, buffer);
                    buffer := '';
                ELSE
                    buffer := buffer || part;
                END IF;
            END LOOP;
        
            parts := array_append(parts, buffer); -- append last buffer
            RETURN parts;
        END;
        $$ LANGUAGE plpgsql;
    

    The Result would be

    {"keyone="valone"","key_two="val two"",KEY_THREE=val,three,"keyfour="http://sample.domain?par1=val1&par2=val2"",keyFIVE=valfive}
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search