skip to Main Content

Given a name_loc column of text like the following:

{"Charlie – White Plains, NY","Wrigley – Minneapolis, MN","Ana – Decatur, GA"}

I’m trying to extract the names, ideally separated by commas:

Charlie, Wrigley, Ana

I’ve gotten this far:

SELECT SUBSTRING(CAST(name_loc AS VARCHAR) from '"([^ –]+)')
FROM table;

which returns

Charlie

How can I extend this query to extract all names?

3

Answers


  1. You can do this with a combination of regexp_matches (to extract the names), array_agg (to regroup all matches in a row) and array_to_string (to format the array as you’d like, e.g. with a comma separator):

    WITH input(name_loc) AS (
        VALUES ('{"Charlie – White Plains, NY","Wrigley – Minneapolis, MN","Ana – Decatur, GA"}')
             , ('{"Other - somewhere}') -- added this to show multiple rows do not get merged
    )
    SELECT array_to_string(names, ', ')
    FROM input
    CROSS JOIN LATERAL (
        SELECT array_agg(name)
        FROM regexp_matches(name_loc, '"(w+)', 'g') AS f(name)
    ) AS f(names);
    
    array_to_string
    Charlie, Wrigley, Ana
    Other

    View on DB Fiddle

    Login or Signup to reply.
  2. My two cents, though I’m rather new to postgreSQL and I had to copy the 1st piece from @Marth’s his answer:

    WITH input(name_loc) AS (
        VALUES ('{"Charlie – White Plains, NY","Wrigley – Minneapolis, MN","Ana – Decatur, GA"}')
             , ('{"Other - somewhere"}') 
    )
    SELECT REGEXP_REPLACE(name_loc, '{?(,)?"(w+)[^"]+"}?','12', 'g') FROM input;
    
    regexp_replace
    Charlie,Wrigley,Ana
    Other
    Login or Signup to reply.
  3. Your string literal happens to be a valid array literal.
    (Maybe not by coincidence? And the column should be type text[] to begin with?)

    If that’s the reliable format, there is a safe and simple solution:

    SELECT t.id, x.names
    FROM   tbl t
    CROSS  JOIN LATERAL (
       SELECT string_agg(split_part(elem, ' – ', 1), ', ') AS names
       FROM   unnest(t.name_loc::text[]) elem
       ) x;
    

    Or:

    SELECT id, string_agg(split_part(elem, ' – ', 1), ', ') AS names
    FROM   (SELECT id, unnest(name_loc::text[]) AS elem FROM tbl) t 
    GROUP  BY id;
    

    db<>fiddle here

    Steps

    1. Unnest the array with unnest() in a LATERAL CROSS JOIN, or directly in the SELECT list.

    2. Take the first part with split_part(). I chose ' – ' as delimiter, not just ' ', to allow for names with nested space like "Anne Nicole". See:

    3. Aggregate results with string_agg(). I added no particular order as you didn’t specify one.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search