skip to Main Content

I have a text field in a table that contains JSON data as well as XML data. As I want to work with XML data only if it’s valid XML, I want a way to make sure I can cast the string as XML without producing an error when '{"key":"val"}'::XML is possible.

Basically I want a function select isxml('{"key":"val"}) to return false, and select isxml('<key>1</key>') to be true.

I checked existing Postgres functions such as xml_is_well_formed, but they still return true when checking JSON strings. Maybe I can catch the error and deal with it in exceptions after a bad cast? Is there a good way to do this?

2

Answers


  1. Chosen as BEST ANSWER

    My temporary solution is as follows

    CREATE OR REPLACE FUNCTION isjson(p_json text)
     RETURNS integer
     LANGUAGE plpgsql
     IMMUTABLE
    AS $function$
    begin
      perform (p_json::json is not null);
      return 1;
    exception 
      when others then
         return 0;  
    end;
    $function$;
    
    CREATE OR REPLACE FUNCTION isxml(p_xml text)
     RETURNS boolean
     LANGUAGE plpgsql
     IMMUTABLE
    AS $function$
    BEGIN
        PERFORM (p_xml::XML IS NOT NULL);
        IF (xml_is_well_formed(p_xml) 
            AND NOT (CASE WHEN isjson(p_xml) = 1 THEN TRUE ELSE false END) 
            AND (SELECT p_xml ~ '^<.*>') )THEN      -- regex matches <>, this may have uncovered edge cases
            RETURN true;
        ELSE 
            RETURN false;
        END IF;
    
    EXCEPTION 
        WHEN OTHERS THEN
            RETURN false;  
    END;
    $function$;
    

    Note: my isjson function returns integer due to other legacy compatibility reasons, it would be easier to use boolean for this specific case. This should rule out most problematic cases but have lots of limitations in the regex used, accepting suggestions for improvement.


  2. One possibility would be to use the xml_is_well_formed together with a function that checks whether or not the text content is a valid json. I.e.:

    create or replace function is_valid_json(content text)
      returns boolean
    as
    $$
    begin
      return (content::json is not null);
    exception 
      when others then
         return false;  
    end;
    

    And in your query, you do [...] xml_is_well_formed(content) and not is_valid_json(content) [...].

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search