
I have a very small test database with 5 documents and 4 fields. I want to update these documents while importing a JSON file: add new fields and replace old values with new ones.


Previously, I managed to do this when importing a CSV file.

Code for CSV file:

    import time
    import pandas as pd

    def update_and_add_with_csv(self, data, key):
        """Update all documents in the collection from a CSV file using pandas
        (add new columns and overwrite old values)."""

        df = pd.read_csv(data, low_memory=False)
        rows = df.to_dict('records')

        startTime = time.time()

        for row in rows:
            # Match on the key field; insert the row if no document matches
            self.collection.update_one({key: row.get(key)}, {'$set': row}, upsert=True)

        endTime = time.time()
        totalTime = '{:.3f}'.format(endTime - startTime)
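
To make the intended behaviour concrete, here is a minimal sketch that models what each `update_one(..., {'$set': row}, upsert=True)` call in the loop above does, using a plain list of dicts instead of a real collection. The helper `apply_set_upsert` and the field names (`date`, `open`, `close`, `volume`) are made up for the example and are not part of pymongo.

```python
def apply_set_upsert(collection, key, row):
    """Model a $set update with upsert=True on a list of dicts (not real pymongo)."""
    for doc in collection:
        if doc.get(key) == row.get(key):
            # Matching document: overwrite existing fields, add new ones
            doc.update(row)
            return "updated"
    # No matching document: insert a copy of the row
    collection.append(dict(row))
    return "inserted"


# Hypothetical sample data
docs = [{"date": "2021-01-01", "open": 10, "close": 12}]
rows = [
    {"date": "2021-01-01", "close": 13, "volume": 500},
    {"date": "2021-01-02", "open": 13, "close": 14},
]

results = [apply_set_upsert(docs, "date", row) for row in rows]
print(results)
print(docs)
```

The first row matches an existing document, so `close` is overwritten and `volume` is added while `open` is left alone. The second row has no match and is inserted as a new document.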

How can this be done with JSON?

The JSON file looks like this:

[image: sample of the JSON file]

2 Answers


  1. Chosen as BEST ANSWER

    Yes, exactly, it works in a similar way. This might be useful to someone:

        import json
        import time

        def update_and_add_with_json(self, data, key):
            """Update all documents in the collection from a JSON file
            (expects a top-level array of objects)."""

            with open(data) as file:
                file_data = json.load(file)

            startTime = time.time()

            for row in file_data:
                # Match on the key field; insert the row if no document matches
                self.collection.update_one({key: row.get(key)}, {'$set': row}, upsert=True)

            endTime = time.time()
            totalTime = '{:.3f}'.format(endTime - startTime)
    

  2. I think the best way to do this is to replace these documents rather than update them.

    I’m assuming your date fields can be used as unique identifiers.

        import json
        import time

        def update_and_add_with_json(self, file_path):
            """Replace documents in the collection with records from a JSON file."""

            # Use a context manager so the file handle is closed after loading
            with open(file_path, "r") as file:
                file_data = json.load(file)

            start_time = time.time()
            for record in file_data:
                # Swap the whole document whose "date" matches the record.
                # Records with no match are skipped unless upsert=True is passed.
                self.collection.find_one_and_replace({"date": record["date"]}, record)
            end_time = time.time()

            total_time = '{:.3f}'.format(end_time - start_time)
            return total_time
    

    I'm not sure how your JSON file is formatted, but if it matches your schema this should work. It also makes it easier to add fields dynamically and takes advantage of MongoDB's schemaless design.
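
Since the layout of the JSON file is uncertain, it may help to normalise it before looping over the records. The sketch below assumes the file holds either a top-level array of objects or a single object; `load_records` is a hypothetical helper name and the sample data is invented for the demo.

```python
import json
import os
import tempfile


def load_records(file_path):
    """Return the content of a JSON file as a list of dicts."""
    with open(file_path, encoding="utf-8") as fh:
        data = json.load(fh)
    # A single top-level object is treated as a one-record file
    if isinstance(data, dict):
        data = [data]
    if not isinstance(data, list) or not all(isinstance(r, dict) for r in data):
        raise ValueError("expected a JSON object or an array of objects")
    return data


# Demo with a throwaway file containing invented sample data
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump([{"date": "2021-01-01", "close": 13}], fh)
    records = load_records(path)

print(records)
```

The returned list can then be fed to either loop shown in the answers, in place of the bare `json.load` result.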
