
I am trying to load a JSON file (output from Mongo’s Export-MdbcData) into a SQL Server table using PowerShell. Example data from the JSON file (one object per line):

{ "code" : "0088", "name" : "BUTTON", "detail" : { "quantity" : 1 } }
{ "code" : "0081", "name" : "MATTERHORN", "detail" : { "quantity" : 2 } }
{ "code" : "0159", "name" : "BANKSTON", "detail" : { "quantity" : 1 } }

In the PowerShell script below, the file is read into an array, the array is converted into a DataTable, and the DataTable is loaded into a SQL Server table. Is there a better/faster way to read in the JSON file? With a small input file it only takes seconds to load the data, but with more than 4M records the whole process takes hours.

$encoding = [System.Text.Encoding]::UTF8    
$output = [System.Collections.ArrayList]::new()

foreach ($line in [System.IO.File]::ReadLines($pathToJsonFile, $encoding)) 
{
    $json = $line | ConvertFrom-Json 
    foreach ($detail in $json.detail) 
    {
       [void]$output.Add(
                          [pscustomobject]@{
                                      code = $json.code
                                      name = $json.name
                                      quantity = $detail.quantity
                                    }
                        )
    } 
} 
$dataTable = $output | ConvertTo-DataTable
...

2 Answers


  1. Assuming you have enough memory, you can greatly speed up your processing as follows:

    $dataTable =
      '[' + (Get-Content -Raw -Encoding utf8 $pathToJsonFile) + ']' |
        ConvertFrom-Json |
        ForEach-Object {
          foreach ($detail in $_.detail) { 
            [pscustomobject]@{
              code = $_.code
              name = $_.name
              quantity = $detail.quantity
            }
          }
        } |
        ConvertTo-DataTable
    
  2. It might be faster to create the DataTable up front and add the data to it directly, rather than going through an ArrayList of pscustomobject instances:

    $dataTable = [System.Data.DataTable]::new();
    [void]$dataTable.Columns.Add('code', [string]);
    [void]$dataTable.Columns.Add('name', [string]);
    [void]$dataTable.Columns.Add('quantity', [int]);
    
    $encoding = [System.Text.Encoding]::UTF8;
    
    foreach ($line in [System.IO.File]::ReadLines($pathToJsonFile, $encoding)) 
    {
        $json = $line | ConvertFrom-Json;
        foreach ($detail in $json.detail) 
        {
            [void]$dataTable.Rows.Add($json.code, $json.name, $detail.quantity);
        }
    }
    

    You may also want to pre-allocate the DataTable's capacity to a large enough value to prevent repeated resizing of the underlying array:

    $dataTable.MinimumCapacity = 4100000;
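
    Since the question mentions that the *whole process* (including the SQL Server load) takes hours, the load step may also be worth attention. A minimal sketch of a bulk insert via .NET's `SqlBulkCopy`, assuming `$dataTable` has been populated as above and its column order matches the destination table — the connection string and table name below are placeholders, not taken from the question:

    ```powershell
    # Load the ADO.NET assembly (Windows PowerShell; in PowerShell 7+ use the
    # Microsoft.Data.SqlClient package instead).
    Add-Type -AssemblyName System.Data

    # Placeholder connection string and table name -- replace with your own.
    $connectionString = 'Server=myServer;Database=myDb;Integrated Security=True'
    $bulkCopy = [System.Data.SqlClient.SqlBulkCopy]::new($connectionString)
    try {
        $bulkCopy.DestinationTableName = 'dbo.MyTable'
        $bulkCopy.BatchSize       = 50000   # commit in batches rather than one huge transaction
        $bulkCopy.BulkCopyTimeout = 0       # disable the timeout for large loads
        $bulkCopy.WriteToServer($dataTable) # streams all rows to the server
    }
    finally {
        $bulkCopy.Dispose()
    }
    ```

    Bulk copy avoids the per-row round trips of individual INSERT statements, which is usually the dominant cost when pushing millions of rows.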
    