
Hi, I am building a pipeline to convert XML to CSV.
In NiFi, I first convert the XML data to JSON and then convert the JSON to CSV, but my output file is not proper CSV. Can anyone help me with this?

My XML data:

 <?xml version="1.0" encoding="UTF-8"?>
 <bookstore>
   <book category="COOKING">
     <title lang="en">Everyday Italian</title>
     <author>Giada De Laurentiis</author>
     <year>2005</year>
     <price>30.00</price>
   </book>
   <book category="CHILDREN">
     <title lang="en">Harry Potter</title>
     <author>J K. Rowling</author>
     <year>2005</year>
     <price>29.99</price>
   </book>
   <book category="WEB">
     <title lang="en">XQuery Kick Start</title>
     <author>James McGovern</author>
     <author>Per Bothner</author>
     <author>Kurt Cagle</author>
     <author>James Linn</author>
     <author>Vaidyanathan Nagarajan</author>
     <year>2003</year>
     <price>49.99</price>
   </book>
   <book category="WEB">
     <title lang="en">Learning XML</title>
     <author>Erik T. Ray</author>
     <year>2003</year>
     <price>39.95</price>
   </book>
 </bookstore>

My pipeline image: [image]

After converting the XML to JSON, my JSON data is as follows:

 [{"book":[
   {"title":"Everyday Italian","author":"Giada De Laurentiis","year":2005,"price":30.0},
   {"title":"Harry Potter","author":"J K. Rowling","year":2005,"price":29.99},
   {"title":"XQuery Kick Start","author":"Vaidyanathan Nagarajan","year":2003,"price":49.99},
   {"title":"Learning XML","author":"Erik T. Ray","year":2003,"price":39.95}
 ]}]

Up to this point it is fine, but when converting it to CSV I get the wrong CSV data.
My convert-JSON-to-CSV processor image: [image]

My CSV output data is as follows:

book
"[MapRecord[{title=Everyday Italian, year=2005, author=Giada De Laurentiis, price=30.0}], MapRecord[{title=Harry Potter, year=2005, author=J K. Rowling, price=29.99}], MapRecord[{title=XQuery Kick Start, year=2003, author=Vaidyanathan Nagarajan, price=49.99}], MapRecord[{title=Learning XML, year=2003, author=Erik T. Ray, price=39.95}]]"

2 Answers


  1. A couple of things are happening here.

    First, you only need the JoltTransformJSON processor if you are changing the content being operated on (i.e. filtering fields, transforming arrays, etc.). If not, you can remove that processor and use ConvertRecord to go directly from XML to CSV. You do not need an intermediate JSON representation.

    Secondly, the reason your CSV is not what you expected is that the intermediate JSON has arrays as values for keys, so the whole book array is written into a single CSV cell. There should be XPath expressions which allow you to read from an array and translate each element to a row of CSV. If you cannot build one, I would suggest using a ScriptedRecordReader and/or ScriptedRecordSetWriter to translate from a serialized form to a complex internal format and back to serialized.

    I would suggest first trying to go directly from XML to CSV and only then making changes to the data, unless necessary data is lost in the transformation. If it is, the additional translation steps may be required.

  2. What is your expected result?
    Your XML is not an easy conversion, as some books contain several authors.
    It might also be good to route the failures to a log message so you can see them more easily.
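
To make the target shape discussed in the answers concrete (one CSV row per book, with repeated authors collapsed into one field), here is a minimal sketch in plain Python using only the standard library. It runs outside NiFi and is not a NiFi processor configuration. The function name `books_to_csv`, the `authors` column, and the "; " separator are assumptions made for illustration, and the sample is trimmed to two books from the question.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Trimmed sample based on the question's XML (two of the four books).
SAMPLE_XML = """\
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="WEB">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <year>2003</year>
    <price>49.99</price>
  </book>
</bookstore>
"""

# Hypothetical column layout; adjust to the CSV you actually want.
FIELDS = ["category", "title", "authors", "year", "price"]


def books_to_csv(xml_text: str, author_sep: str = "; ") -> str:
    """Flatten each <book> element into one CSV row."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for book in root.iter("book"):
        writer.writerow({
            "category": book.get("category", ""),
            "title": book.findtext("title", default=""),
            # Join repeated <author> elements so none of them is dropped.
            "authors": author_sep.join(a.text or "" for a in book.findall("author")),
            "year": book.findtext("year", default=""),
            "price": book.findtext("price", default=""),
        })
    return out.getvalue()


if __name__ == "__main__":
    print(books_to_csv(SAMPLE_XML))
```

Joining the authors keeps every book on a single row. If one row per author is preferred instead, the loop would write a row for each `<author>` element and repeat the other fields.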
