I have a PostgreSQL database with a table named ‘contacts’.
The table has a field of type ‘text’ named ‘xml_data’.
The ‘xml_data’ field contains strings of XML data that includes 0-3 email addresses.
An example of the XML data is:
`<emailInfo><typedEmail><type>E-mail</type><value>[email protected]</value></typedEmail><datas><typedEmail><type/><value>[email protected]</value><active>true</active></typedEmail></datas></emailInfo>
`<emailInfo><typedEmail><type>Mail</type><value>[email protected]</value></typedEmail><datas><typedEmail><type>Mail</type><value>[email protected]</value></typedEmail></datas></emailInfo>`
`<emailInfo><typedEmail><type/><value>[email protected]</value></typedEmail><datas><typedEmail><type/><value>[email protected]</value><active>true</active></typedEmail></datas></emailInfo>`
`<emailInfo><typedEmail><type/><value>[email protected]</value></typedEmail><datas><typedEmail><type/><value>[email protected]</value><active>true</active></typedEmail></datas></emailInfo>`
Email addresses are always enclosed within and tags.
However, sometimes the XML string does not contain an email address and looks like this:
`<emailInfo><typedEmail><value/></typedEmail></emailInfo>`
I want to extract all email addresses from the ‘xma_data’ field for each record and return the results in a non-technical, human-readable format such as a comma-separated list of emails.
For example:
id; emails;
1; [email protected],[email protected];
2; [email protected],[email protected];
Can anyone suggest how to write a SQL query to achieve this? Thank you.
I tried expressions like:
select (xpath('//value/text()', xmlparse(content xml_data)))[1]::text AS email from contacts
or
select array_to_string(REGEXP_MATCHES(email_AB.xml_data, '<value>(.*?)</value>'), ',') AS email
The problem that these expressions extract the first email, but I need all of them
2
Answers
You need to add the ‘g’ flag to match all:
You made your xpath a lot more complicated that it needed to be.
Fiddle