skip to Main Content

I have been trying to make changes to a csv invoice that my company gets on a monthly basis. However, the File is 1.5 GB in size and contains over 1.500.000 rows with 36 columns. While I can open the file in Microsoft Excel, trying to edit the data results in a majority of the rows/columns getting deleted without my knowledge. Trying to go in and edit it results in me loosing about 16MB of data. I wrote a PowerShell script to split the file up into two different files, but the problem with that is that one of the CSVs is missing a header and I have to manually add it in. I also have the same issue of some data being deleted despite being under the 1,048,576 rows limit that Excel has.

I have tried updating the CSV in Visual Studio Code and using the Rainbow CSV extension, but I’m seeing Visual Studio crash every time I try to open the CSV. I have also tried using Notepad++ and using the CSV Lite plugin, but that doesn’t have a filter option to make the operation easier. I was wondering if anyone had suggestions for editing a CSV file this big or if they know a way to edit the data in Excel without the risk of data being deleted on accident?

Thank you!

  • Editing file in Excel, but random data gets deleted without my consent.
  • Editing file in Visual Studio Code, but it constantly crashes.
  • Editing file in Notepad++, but the CSV Lite plugin isn’t as friendly as Excel for filtering data.

2

Answers


  1. Chosen as BEST ANSWER

    @MagineMozios had the right idea. It is very easy to write a PowerShell script to edit the CSV data without deleting anything. Here is the PowerShell script I made to edit the data of my 1.5GB CSV File.

    Code:

    # Define the path to the CSV file
    $csvPath = "C:Usersyourfile.csv"  # Replace 'yourfile.csv' with your actual CSV file name
    
    # Load the CSV into a variable
    $csvData = Import-Csv -Path $csvPath
    
    # Iterate through each row in the CSV
    foreach ($row in $csvData) {
        # Check if any column contains "SubscriptionName1" and update the "CostCenter" column to "####"
        if ($row.PSObject.Properties.Value -contains "SubscriptionName1") {
            $row."CostCenter" = "####"
        }
        # Check if any column contains "SubscriptionName2" and update the "CostCenter" column to "####"
        elseif ($row.PSObject.Properties.Value -contains "SubscriptionName2") {
            $row."CostCenter" = "####"
        }
        
        elseif ($row.PSObject.Properties.Value -contains "SubscriptionName3") {
            $row."CostCenter" = "####"
        }
    }
    
    # Export the updated CSV
    $csvData | Export-Csv -Path $csvPath -NoTypeInform
    

    Thank you to everyone who contributed!


  2. EmEditor will be a good choice for so large CSV file.
    https://www.emeditor.com/

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search