I am using NiFi to load data from database A in json record format and want to insert json record into a json column of a table in database B.
A => ExecuteSQLRecord => [jsonRow,...] => [?] => B.table(data json)
which processor should I use at the position [?]. I tried jolt and PutDatabaseRecord, but failed:
[jsonRow,...] => jolt =>[{data:jsonRow},...]=> PutDatabaseRecord => B.Table(data json)
I got error:
Batch entry 0 insert into schema.table(data) values((‘MapRecord [{…}]’)) was aborted: ERROR: invalid input syntax for type json Detail: Token "MapRecord" is invalid. Where: JSON data, line 1: MapRecord… unnamed portal parameter $1 = ‘…’ Call getNextException to see other errors in the batch.
GenerateFlowFile=>JoltTransformJson=>PutDatabaseRecord
the data path is:
[{id,name},{id,nam}]=>[{data:{id,name}},{data:{id,name}}]=>pgTable(id serial,data json)
along this flow, pg get two rows data as expected. It told me it’s feasible to do it simply using jolt and putdatabaserecord. The problem emerges however in the production env. In spite of it, to keep flow simple as possible, i’d like to prior to make deep test and further reading to find out the reason embeded in production env.
[2024-08-03] New test shows that null plays role in the problem. If a json row has any property with null value, it would be embraced with ‘MapRecord[(…)]’ before sent to db. That leads to json parsing error. However I don’t want to remove null property except I have no alternative because field omission test is also a part of data validation in db. Keeping null value and removing the wrappage of ‘MapRecord’ is the preference.
2
Answers
After quite a few trying, I found a not elegant but make-do method:
According to this Jira, you can "stringify" the JSON and it should work with
PutDatabaseRecord
, at least for PostgreSQL and MySQL. You should be able to useUpdateRecord
with theescapeJson()
RecordPath function to convert your JSON object to a string, then you can continue with PutDatabaseRecord. If UpdateRecord doesn’t work, you could use ExecuteScript with Groovy to convert the JSON to a string and re-infer the schema using something similar to the technique I show in this blog post.