skip to Main Content

I have these messy fields that I need to cleanup in Dataflow. EAN field can have any number of elements separated by ",". But the actual EAN is always the last one. However it seems that there is no way I can get this simple split function to work properly. It always returns only one element with the full content.

What am I doing wrong here?
Debugged result of split, which is not split

Result of the code that should work

2

Answers


  1. Chosen as BEST ANSWER

    The reason for this problem was in the messy source data. What I though was a normal comma, wasn't instead it was some strange full width comma eg Unicode Character U+FF0C. The code below made this finally work, it is ugly though.

    split(trim(regexReplace(regexReplace(EAN, '[\x09\r\n\t]', ''),'[\uFF0C]',',')), ",")[size(split(trim(regexReplace(regexReplace(EAN, '[\x09\r\n\t]', ''),'[\uFF0C]',',')),','))]
    

  2. EAN field can have any number of elements separated by ",". But the actual EAN is always the last one.

    To get actual EAN from your all the comma separated value you need to use below expression.

    split({_col0_}, ",")[size(split({_col0_}, ","))]
    

    enter image description here

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search