skip to Main Content

My table has some null and empty strings, these look ugly when I query the table so I want to replace them with other values. I don’t own the data so I can’t modify the table itself, this needs to happen when I query.

I tried using regexp_replace to replace the empty strings with regexp_replace(column, '^$', '(no value found/some other custom message)' which didn’t work

3

Answers


  1. null and empty string have to be treated separately. In addition, there’s a difference between '' (empty) and ' ' (blank).

    null is "unknown". regexp_replace(null, '^$', 'default') doesn’t do anything because the value it’s matching is unknown. The match fails and it returns null.

    Instead, use coalesce. It returns the first non-null value. coalesce(thing, 'default').

    To capture empty and blank strings, search for ^s*$. regexp_replace(thing, '^s*$', 'default').

    We put them together like so…

    select
      coalesce(regexp_replace(thing, '^s*$', 'default'), 'default')
    from test;
    

    If thing is null, regexp_replace will return null, and coalesce will return 'default'.

    If thing is empty or blank, regexp_replace will return default and coalesce will return default.

    If thing is none of those, regexp_replace will return thing and coalesce will return thing.

    An alternative that might be easier to understand is:

    select
      case 
        when trim(word) = '' then 'default'
        else coalesce(word, 'default')
      end
    from test
    

    If you like, you can put this into a function. We can declare it immutable (the same arguments always produce the same result) potentially giving it a performance boost.

    create function replace_blanks(text, text) returns text
    immutable
    as $$
      select coalesce(regexp_replace($1, '^s*$', $2), $2)
    $$ language sql;
    
    select replace_blanks(word, 'default') from test;
    

    Demonstration in PostgreSQL. Redshift is derived from PostgreSQL so it should work the same.

    Login or Signup to reply.
  2. Convert null to blank using coalesce() before applying your regexp_replace:

    regexp_replace(coalesce(thing, ''), '^s*$', 'default')
    

    With Redshift you can also use nvl() instead of coalesce() for a briefer solution, but coalesce() is the SQL standard and therefore arguably more readable.

    Login or Signup to reply.
  3. This is the fastest way to replace null and empty strings:

    CASE WHEN column <> '' THEN column ELSE 'default' END
    

    Only applicable to string types. Most other data types have no "empty" value.

    This uses only a very basic CASE expression, so it works in any RDBMS (certainly including Postgres and Redshift) that handles null values according to the SQL standard.

    Another standard SQL way:

    COALESCE(NULLIF(column, ''), 'default')
    

    See:

    Regular expressions are way to expensive for the simple task (IMO).

    If you have to deal with non-printing characters, you need to do more. See:

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search