I am trying to create a data flow where i can remove the line break from a csv file and because of it I’m getting an error whenever I tried INSERTING those values in a table.
The problem is I have a CSV file with "Comments" column where the user inputs paragraph like values in a cell for example:
"Caller received something
- Email"
So I need to put this in a BLOB container in Azure where I can use Data flow to remove line breaks but the problem is whenever I use COPY DATA activity to move the file from a directory to BLOB Container, it separates this type of comments into 3 rows. I think the reason here is the row delimeter setting in COPY DATA.
In this image I set it as No delimeterr but it gives me an error like this.
I really need to find an option to put the file using ADF to Blob as it is without changing any values or structures from the CSV. Any suggestions?
Here is the sample data im trying to insert:
In the 2nd row you can notice there’s a 3 line comment in it.
The only reason I’m moving this file to BLOB is so that I can use data flows to remove the line breaks in the comments column and make each comments look like a single sentence. Are there are any approach and suggestions to do this? I’ve tried doing this solution but it doesn’t work on CSV files.
3
Answers
I found an answer to this. So I still used Data Flows Derived Column to manipulate the comments but I used nested replace function to make sure all the possible line breaks are gonna be handled. Including trailing spaces before and after linebreaks.
If you want ADF to copy the files to blob storage as is, without changing them, use the binary data format in your source and sink datasets.
Since the data is wrapped with quotes, you can use
Double quote (")
as quote character anddefault (rn)
as row delimiter.When no quote character is given, data is divided into three rows.
When preview data is clicked, data is previewed in a single line itself.