How to split the string based on white spaces in DataFlow expression builder in ADF pipeline.
A string can have 1 or more white spaces.
Eg:
I used split(name,’ ‘) : Thiswill split the word based on single space.Some times it can have more than 1 space. How to handle that?
Joe Smith(1 white space)
Joel Smith(2 white space)
Joplin Smith(3 white space)
2
Answers
This is my sample data with spaces.
Use the below expression to get the words array without empty strings.
filter(split(column,' '),#item!='')
As there is no join function in dataflow expressions, to get the string of words with one space, use below expression from this SO Answer by @Jarred Jobe.
I notice a derivative of this problem from other programming langauges.
If you want to remove multiple extra spaces then you can use 3 Replace() functions in a row to replace them with a single space.
If we set a parameter "newparam" to your example
"Joplin Smith"
, we can use 3 replaces like so:@replace(replace(replace(pipeline().parameters.newparam, ' ', ' %'), '% ', ''), '%', '')
Output after each replace:
At this point you can wrap the command with the split() function and it will only intercept single spaces:
Produces the array: