I am trying to load a JSON file (output from Mongo's Export-MdbcData) into a SQL Server table using PowerShell. The file contains one JSON record per line, for example:
{ "code" : "0088", "name" : "BUTTON", "detail" : { "quantity" : 1 } }
{ "code" : "0081", "name" : "MATTERHORN", "detail" : { "quantity" : 2 } }
{ "code" : "0159", "name" : "BANKSTON", "detail" : { "quantity" : 1 } }
In the PowerShell script below, the file is read into an array and the array is converted into a DataTable to load into a SQL Server table. Is there a better/faster way to read in the JSON file? With a small input file the load takes only seconds, but with more than 4M records the whole process takes hours.
$encoding = [System.Text.Encoding]::UTF8
$output = [System.Collections.ArrayList]::new()
foreach ($line in [System.IO.File]::ReadLines($pathToJsonFile, $encoding))
{
    $json = $line | ConvertFrom-Json
    foreach ($detail in $json.detail)
    {
        [void]$output.Add(
            [pscustomobject]@{
                code     = $json.code
                name     = $json.name
                quantity = $detail.quantity
            }
        )
    }
}
$dataTable = $output | ConvertTo-DataTable
2 Answers
Assuming you have enough memory, you can greatly speed up your processing as follows:
It may be faster to create the DataTable up front and add the data to it directly, skipping the intermediate ArrayList and pscustomobject objects.
You may also want to pre-allocate the DataTable's capacity to something large enough to prevent repeated resizing of the underlying array.
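The suggestions above can be sketched as follows. This is a minimal, self-contained example: it writes the three sample records from the question to a temp file so it can run on its own, and it assumes the same column names (`code`, `name`, `quantity`) as the original script. `MinimumCapacity` is a standard `DataTable` property used here for the pre-allocation; for the real 4M-record file you would set it to an estimate of the final row count.

```powershell
# Write the sample records to a temp file so the sketch is runnable as-is;
# in the real script $pathToJsonFile would already point at the export.
$pathToJsonFile = [System.IO.Path]::GetTempFileName()
@'
{ "code" : "0088", "name" : "BUTTON", "detail" : { "quantity" : 1 } }
{ "code" : "0081", "name" : "MATTERHORN", "detail" : { "quantity" : 2 } }
{ "code" : "0159", "name" : "BANKSTON", "detail" : { "quantity" : 1 } }
'@ | Set-Content -Path $pathToJsonFile -Encoding UTF8

# Build the DataTable once, with typed columns, instead of going through
# an ArrayList of pscustomobject and a later ConvertTo-DataTable pass.
$dataTable = [System.Data.DataTable]::new()
[void]$dataTable.Columns.Add('code', [string])
[void]$dataTable.Columns.Add('name', [string])
[void]$dataTable.Columns.Add('quantity', [int])

# Pre-allocate to avoid repeated resizing of the underlying row array.
# 1000 suits this sample; use ~4000000 for the file in the question.
$dataTable.MinimumCapacity = 1000

foreach ($line in [System.IO.File]::ReadLines($pathToJsonFile, [System.Text.Encoding]::UTF8))
{
    $json = $line | ConvertFrom-Json
    foreach ($detail in $json.detail)
    {
        # Rows.Add takes the values positionally, matching the column order.
        [void]$dataTable.Rows.Add($json.code, $json.name, $detail.quantity)
    }
}
```

Once the DataTable is filled, `System.Data.SqlClient.SqlBulkCopy` (its `WriteToServer` method accepts a DataTable) is the usual fast path for the actual load into SQL Server, rather than row-by-row inserts.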