Hi, I am building a pipeline to convert XML to CSV.
In NiFi I first converted the XML data to JSON and then converted the JSON to CSV, but my output file is not proper CSV. Can anyone help me with this?
My XML data:
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
After converting the XML to JSON, my JSON data is as below:
[{"book":[{"title":"Everyday Italian","author":"Giada De Laurentiis","year":2005,"price":30.0},
{"title":"Harry Potter","author":"J K. Rowling","year":2005,"price":29.99},
{"title":"XQuery Kick Start","author":"Vaidyanathan Nagarajan","year":2003,"price":49.99},
{"title":"Learning XML","author":"Erik T. Ray","year":2003,"price":39.95}]}]
Up to here it is fine, but when converting it to CSV I get the wrong CSV data.
[Image: my convert JSON to CSV processor]
My CSV output data is as below:
book
"[MapRecord[{title=Everyday Italian, year=2005, author=Giada De
Laurentiis, price=30.0}], MapRecord[{title=Harry Potter, year=2005,
author=J K. Rowling, price=29.99}], MapRecord[{title=XQuery Kick
Start, year=2003, author=Vaidyanathan Nagarajan, price=49.99}],
MapRecord[{title=Learning XML, year=2003, author=Erik T. Ray,
price=39.95}]]"
2 Answers
A couple of things are happening here.
First, you only need the JoltTransformJSON processor if you are changing the content being operated on (i.e. filtering fields, transforming arrays, etc.). If not, you can remove that processor and use ConvertRecord to go directly from XML to CSV. You do not need an intermediate JSON representation.
Secondly, the reason your CSV is not as you expected is that the intermediate JSON has arrays as values for keys. There should be XPath expressions which allow you to read from an array and translate to a row of CSV. If you cannot build one, I would suggest using a ScriptedRecordReader and/or ScriptedRecordSetWriter to translate from a serialized form to a complex internal format and back to serialized.
I would suggest trying to go directly from XML to CSV and then make changes to the data unless necessary data is lost in the transformation. If it is, the additional translation steps may be required.
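To make the "arrays as values" point concrete, here is a minimal sketch in plain Python, outside NiFi. It is only an illustration of the flattening that has to happen before a CSV writer sees the data, not a NiFi configuration. The function names flatten_books and to_csv are made up for this example; the JSON is the one from the question.

```python
import csv
import io
import json

# JSON as shown in the question: a one-element array wrapping a "book" array.
nested = json.loads("""
[{"book":[
  {"title":"Everyday Italian","author":"Giada De Laurentiis","year":2005,"price":30.0},
  {"title":"Harry Potter","author":"J K. Rowling","year":2005,"price":29.99},
  {"title":"XQuery Kick Start","author":"Vaidyanathan Nagarajan","year":2003,"price":49.99},
  {"title":"Learning XML","author":"Erik T. Ray","year":2003,"price":39.95}
]}]
""")


def flatten_books(data):
    """Yield one flat dict per book instead of one record holding an array.

    Hypothetical helper for illustration only.
    """
    for wrapper in data:
        for book in wrapper.get("book", []):
            yield book


def to_csv(rows, fields=("title", "author", "year", "price")):
    """Write flat dicts as CSV text, one row per dict (hypothetical helper)."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


csv_text = to_csv(flatten_books(nested))
print(csv_text)
```

With the outer wrapper there is a single record whose only field, book, holds the whole array, which is why the CSV in the question has one column. Once each book is its own record, every field maps to a column.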
What is your expected result?
Your XML is not an easy conversion, as a book can have several authors.
It might also be good to route the failures to a log message so you can see them more easily.
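As an illustration of why the expected result matters, here is a rough Python sketch (not NiFi) of one possible choice: one CSV row per book, with multiple authors joined into a single cell. The separator "; ", the function name book_rows and the shortened sample XML are assumptions for this example. Other layouts, such as one row per author, are equally valid.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened sample based on the question's XML (two books, one with two authors).
xml_text = """<bookstore>
<book category="WEB">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>"""


def book_rows(xml_string, author_sep="; "):
    """Return one flat dict per <book>, joining repeated <author> elements.

    Hypothetical helper; author_sep is an assumed convention, not a standard.
    """
    root = ET.fromstring(xml_string)
    rows = []
    for book in root.findall("book"):
        rows.append({
            "category": book.get("category"),
            "title": book.findtext("title"),
            "author": author_sep.join(a.text for a in book.findall("author")),
            "year": book.findtext("year"),
            "price": book.findtext("price"),
        })
    return rows


rows = book_rows(xml_text)
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(rows[0]), lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Note that the JSON in the question kept only the last author of "XQuery Kick Start", so the intermediate step had already dropped data before the CSV conversion.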