skip to Main Content

I am trying to create a data flow where i can remove the line break from a csv file and because of it I’m getting an error whenever I tried INSERTING those values in a table.

The problem is I have a CSV file with "Comments" column where the user inputs paragraph like values in a cell for example:

"Caller received something

  1. Mail
  2. Email"

So I need to put this in a BLOB container in Azure where I can use Data flow to remove line breaks but the problem is whenever I use COPY DATA activity to move the file from a directory to BLOB Container, it separates this type of comments into 3 rows. I think the reason here is the row delimeter setting in COPY DATA.

enter image description here

In this image I set it as No delimeterr but it gives me an error like this.

enter image description here

I really need to find an option to put the file using ADF to Blob as it is without changing any values or structures from the CSV. Any suggestions?

Here is the sample data im trying to insert:

enter image description here

In the 2nd row you can notice there’s a 3 line comment in it.
The only reason I’m moving this file to BLOB is so that I can use data flows to remove the line breaks in the comments column and make each comments look like a single sentence. Are there are any approach and suggestions to do this? I’ve tried doing this solution but it doesn’t work on CSV files.

3

Answers


  1. Chosen as BEST ANSWER

    I found an answer to this. So I still used Data Flows Derived Column to manipulate the comments but I used nested replace function to make sure all the possible line breaks are gonna be handled. Including trailing spaces before and after linebreaks.

    replace(replace(replace(replace(Comments, "n", ""),"", "")," n", ""),"n ", "")


  2. If you want ADF to copy the files to blob storage as is, without changing them, use the binary data format in your source and sink datasets.
    enter image description here

    Login or Signup to reply.
  3. Since the data is wrapped with quotes, you can use Double quote (") as quote character and default (rn) as row delimiter.

    • Sample data with line breaks is taken.
      enter image description here

    When no quote character is given, data is divided into three rows.

    enter image description here

    • CSV dataset configuration is given as in below image.
    "columnDelimiter": ",",
    "escapeChar": "\",
    "quoteChar": """
    

    When preview data is clicked, data is previewed in a single line itself.

    enter image description here

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search