
I have a use case where I need to split strings by multiple delimiters.

client_id
blah_blah
bleh-bleh
select client_id, split(client_id,'-')[0] col1, split(client_id,'-')[1] col2 from mytbl

returns

client_id col1 col2
bleh-bleh bleh bleh

I have tried various permutations to get two delimiters in, with no success.

select client_id, split(client_id,'-'||'_')[0] col1, split(client_id,'-'||'_')[1] col2 from mytbl

This errors out, but what I want to return is:

client_id col1 col2
blah-blah blah blah
bleh_bleh bleh bleh

3 Answers


  1. You can use regexp_split_to_array with a delimiter regex that matches either a dash or an underscore:

    select 
        client_id,
        (regexp_split_to_array(client_id, '[-_]'))[1] col1, 
        (regexp_split_to_array(client_id, '[-_]'))[2] col2 
        from mytbl;
    

    Here is a fiddle POC: https://www.db-fiddle.com/f/4jyoMCicNSZpjMt4jFYoz5/13029
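
    To try this without the original table, here is a minimal sketch that inlines sample rows with VALUES. The alias s and the extra row 'nodelimiter' are made up for illustration and are not part of the question's data; the extra row is there to show what happens when a value contains neither delimiter.

    ```sql
    -- Sample rows are inlined with VALUES, so no table is required.
    -- The alias "s" and the row 'nodelimiter' are illustrative assumptions.
    select
        client_id,
        (regexp_split_to_array(client_id, '[-_]'))[1] as col1,
        (regexp_split_to_array(client_id, '[-_]'))[2] as col2
    from (values ('blah_blah'), ('bleh-bleh'), ('nodelimiter')) as s(client_id);
    ```

    For a value with no delimiter the array has a single element, so col1 holds the whole string and col2 is NULL. Note that PostgreSQL array subscripts start at 1, not 0.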

  2. Using a CTE, so that regexp_split_to_array is called only once per row:

    with t as (
     select client_id, regexp_split_to_array(client_id, '-|_') arr
     from the_table
    )
    select client_id, arr[1] col1, arr[2] col2
    from t;
    

    Demo

  3. You can do it exactly as you planned, as long as you translate() all the alternative delimiters into the one you want to split on (demo):

    select client_id
      , split_part(client_id2,'-',1)
      , split_part(client_id2,'-',2)
    from mytbl cross join lateral
    (values(translate(client_id,'_;/','---'))) as v(client_id2)
    
    client_id split_part split_part
    blah_blah blah blah
    bleh-bleh bleh bleh

    The nice thing about this approach is that it runs faster than the regex-plus-array approaches. The demo shows it to be 3x faster on 10k-300k random samples, because it avoids the overhead of regex matching and array construction.
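
    To make the translate() step visible, here is a small self-contained sketch that shows the normalized string next to the split results. The alias s, the column name normalized, and the rows 'foo;bar' and 'a/b' are made up for illustration; only the first two rows come from the question.

    ```sql
    -- translate() maps each character of '_;/' to the character at the same
    -- position in '---', so every alternative delimiter becomes a dash.
    -- Rows 'foo;bar' and 'a/b' are illustrative assumptions.
    select
        client_id,
        translate(client_id, '_;/', '---') as normalized,
        split_part(translate(client_id, '_;/', '---'), '-', 1) as col1,
        split_part(translate(client_id, '_;/', '---'), '-', 2) as col2
    from (values ('blah_blah'), ('bleh-bleh'), ('foo;bar'), ('a/b')) as s(client_id);
    ```

    Because translate() works character by character, this only applies when every delimiter is a single character; a multi-character delimiter would still need replace() or a regex.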
