From the Shopify API, I receive a link to a large JSONL file. Using NodeJS, I need to read this data line by line, since loading it all at once would use too much memory. When I hit the JSONL URL from a web browser, it automatically downloads the JSONL file to my downloads folder.
Example of JSONL:
{"id":"gid://shopify/Customer/6478758936817","firstName":"Joe"}
{"id":"gid://shopify/Order/5044232028401","name":"#1001","createdAt":"2022-09-16T16:30:50Z","__parentId":"gid://shopify/Customer/6478758936817"}
{"id":"gid://shopify/Order/5044244480241","name":"#1003","createdAt":"2022-09-16T16:37:27Z","__parentId":"gid://shopify/Customer/6478758936817"}
{"id":"gid://shopify/Order/5057425703153","name":"#1006","createdAt":"2022-09-27T17:24:39Z","__parentId":"gid://shopify/Customer/6478758936817"}
{"id":"gid://shopify/Customer/6478771093745","firstName":"John"}
{"id":"gid://shopify/Customer/6478771126513","firstName":"Jane"}
I’m unsure how to process this data in NodeJS. Do I need to hit the URL, download all of the data into a temporary file, and then process that file line by line? Or can I read the data line by line directly after hitting the URL (via some sort of stream?) and process it without storing a temporary file on the server?
(The JSONL comes from https://storage.googleapis.com/ if that helps.)
Thanks.
2 Answers
Using axios, you can set the response type to a stream, and then use the built-in readline module to process the data line by line. Testing this, there is barely any memory usage.
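A minimal sketch of that approach, assuming axios is installed (the URL and the per-record handling are placeholders):

const axios = require('axios');
const readline = require('readline');

async function processJsonl(url) {
  // Ask axios for the raw response body as a Node readable stream,
  // so the file is never buffered in memory all at once.
  const response = await axios.get(url, { responseType: 'stream' });

  // readline splits the incoming stream on newlines.
  const rl = readline.createInterface({
    input: response.data,
    crlfDelay: Infinity,
  });

  for await (const line of rl) {
    if (!line.trim()) continue; // skip blank lines
    const record = JSON.parse(line); // one JSON object per line
    console.log(record.id); // replace with your own handling
  }
}

// Hypothetical URL; pass the link returned by the Shopify bulk operation.
processJsonl('https://storage.googleapis.com/your-bulk-file.jsonl');

Because the stream is consumed one line at a time, only a single record is held in memory at once, so no temporary file is needed.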
You can also run a great CLI tool called jq. Unlike code tied to the browser, jq can be run anywhere you need to parse JSONL. It will take the bulk file you just downloaded from a query and give you a pure JSON data structure to play with, or to save.
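For example, jq's slurp flag (-s) reads a whole JSONL file into a single JSON array; a minimal sketch, assuming the bulk file was saved as bulk.jsonl (a hypothetical filename):

jq -s '.' bulk.jsonl > bulk.json

Note that jq works on a file that has already been downloaded, so unlike the streaming approach above, this still writes the data to disk first.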