skip to Main Content

How to split the string based on white spaces in DataFlow expression builder in ADF pipeline.
A string can have 1 or more white spaces.

Eg: 

I used split(name,’ ‘) : Thiswill split the word based on single space.Some times it can have more than 1 space. How to handle that?

Joe Smith(1 white space)
Joel  Smith(2 white space)
Joplin   Smith(3 white space)

2

Answers


  1. This is my sample data with spaces.

    enter image description here

    column
    Steve Smith C      S  K
    Rakesh Govindula     Laddu
    Chinna R     C      B
    

    Use the below expression to get the words array without empty strings.

    filter(split(column,' '),#item!='')

    enter image description here

    As there is no join function in dataflow expressions, to get the string of words with one space, use below expression from this SO Answer by @Jarred Jobe.

    dropLeft(toString(reduce(filter(split(column,' '),#item!=''), '', #acc +  ' '  + #item, #result)), 0)
    

    enter image description here

    Login or Signup to reply.
  2. I notice a derivative of this problem from other programming langauges.
    If you want to remove multiple extra spaces then you can use 3 Replace() functions in a row to replace them with a single space.

    If we set a parameter "newparam" to your example "Joplin Smith", we can use 3 replaces like so:

    @replace(replace(replace(pipeline().parameters.newparam, ' ', ' %'), '% ', ''), '%', '')

    Output after each replace:

     1 - "value": "Joplin % % %Smith"
     2 - "value": "Joplin %Smith"
     3 - "value": "Joplin Smith"
    

    At this point you can wrap the command with the split() function and it will only intercept single spaces:

    @split(replace(replace(replace(pipeline().parameters.newparam, ' ', ' %'), '% ', ''), '%', '')
    , ' ')
    

    Produces the array:

    "value": [
            "Joplin",
            "Smith"
        ]
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search