Basically, I want to remove the whitespace that follows a number (between the number and its unit).
Input:
medication_title
CLORIDRATO DE VENLAFAXINA 75 MG
VIIBRYD 40 MG
KTRIZ UNO 0.6 U/G
Output:
medication_title | medication_title2
CLORIDRATO DE VENLAFAXINA 75 MG | CLORIDRATO DE VENLAFAXINA 75MG
VIIBRYD 40 MG | VIIBRYD 40MG
KTRIZ UNO 0.6 U/G | KTRIZ UNO 0.6U/G
Ideas?
2 Answers
We can use a regex replacement here:
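A minimal sketch of such a replacement, assuming PostgreSQL (the `\y` word-boundary escape is PostgreSQL syntax). The `meds` CTE is a hypothetical stand-in for the real table, and the `\s+` between the two capture groups is an assumption about how the whitespace is matched:

```sql
-- Hypothetical inline sample data standing in for the real table.
WITH meds(medication_title) AS (
    VALUES
        ('CLORIDRATO DE VENLAFAXINA 75 MG'),
        ('VIIBRYD 40 MG'),
        ('KTRIZ UNO 0.6 U/G')
)
SELECT
    medication_title,
    -- Group 1: the number (with optional decimal part).
    -- Group 2: the unit, i.e. a run of non-whitespace ending in "G".
    -- The whitespace between them (\s+, assumed) is dropped by
    -- rebuilding the match as \1\2.
    regexp_replace(
        medication_title,
        '\y(\d+(?:\.\d+)?)\s+([^[:space:]]*G)\y',
        '\1\2'
    ) AS medication_title2
FROM meds;
```

On the sample rows this should turn `75 MG`, `40 MG` and `0.6 U/G` into `75MG`, `40MG` and `0.6U/G`. Note that the pattern only touches units ending in "G"; other units would need a broader second group.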
Here is an explanation of the regex pattern:
    \y               word boundary
    (                match and capture in \1
    \d+              a number
    (?:\.\d+)?       followed by optional decimal component
    )                close capture group \1
    (                match and capture in \2
    [^[:space:]]*    zero or more leading non-whitespace characters
    G                followed by "G"
    )                close capture group \2
    \y               another word boundary

You can capture the sequences with a regular expression and then assemble them back as needed, as in
regexp_replace(x, '([^0-9]*[0-9]) +([^0-9.]+)', '\1\2').
For example:
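A minimal sketch of a full query built around this call, assuming PostgreSQL; the `meds` CTE is a hypothetical stand-in for the real table:

```sql
-- Hypothetical inline sample data standing in for the real table.
WITH meds(medication_title) AS (
    VALUES
        ('CLORIDRATO DE VENLAFAXINA 75 MG'),
        ('VIIBRYD 40 MG'),
        ('KTRIZ UNO 0.6 U/G')
)
SELECT
    medication_title,
    -- Group 1 ends on a digit, group 2 is the run of non-digit,
    -- non-dot characters after the spaces; the spaces themselves
    -- are not captured, so \1\2 glues digit and unit together.
    regexp_replace(
        medication_title,
        '([^0-9]*[0-9]) +([^0-9.]+)',
        '\1\2'
    ) AS medication_title2
FROM meds;
```

Without the `'g'` flag, `regexp_replace` replaces only the first match in each string, which is enough for the sample rows; titles with several number-plus-unit pairs would likely need `'g'` as a fourth argument.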