
I am importing a CSV file with pymongo and inserting it into MongoDB, but for some reason every field is stored as a string where I was expecting a double. Below is the Python code.

def saveToMongo():
    print("inside saveToMongo")
    collection = mydb['country']

    header = ['country_id','country_name','zone_id','minLat','maxLat','minLong','maxLong']
    csvFile = open('country.csv', 'r')
    reader = csv.DictReader(csvFile)

    print(reader)

    for each in reader:
        row = {}
        for field in header:
            row[field] = each[field]

        #print(row)
        collection.insert(row)

And here is the CSV file:

country_id,country_name,zone_id,minLat,maxLat,minLong,maxLong
2,Bangladesh,1,20.6708832870000,26.4465255803000,88.0844222351000,92.6727209818000
3,"Sri Lanka",1,5.9683698592300,9.8240776636100,79.6951668639000,81.7879590189000
4,Pakistan,1,23.6919650335000,37.1330309108000,60.8742484882000,77.8374507995000
5,Bhutan,1,26.7194029811000,28.2964385035000,88.8142484883000,92.1037117859000

I am unable to understand why Python is storing the data as strings.

When I try to insert the data using mongoimport, I get the error below:

fatal error: unrecognized DWARF version in .debug_info at 6

runtime stack:
panic during panic

runtime stack:
stack trace unavailable

2 Answers


  1. csv.DictReader parses every value as a string.

    If you want another type, you can write a custom function that converts each field to the type you want.

    Since what you call a double is a float in Python, you could do something like the following:

    from pymongo import MongoClient 
    import csv
    myclient = MongoClient("mongodb://localhost:27017/")
    mydb = myclient["mydbname"]
    
    def csvToMongo():
      with open('country.csv','r') as myFile :
        reader = csv.DictReader(myFile, delimiter=",")
        myParsedData = [
          {
            'country_id' : int(elem['country_id']),
            'country_name' : elem['country_name'],
            'zone_id' : int(elem['zone_id']),
            'minLat' : float(elem['minLat']),
            'maxLat' : float(elem['maxLat']),
            'minLong' : float(elem['minLong']),
            'maxLong' : float(elem['maxLong']),
          }
          for elem in reader
        ]
        collection = mydb['country']
        collection.insert_many(myParsedData)
    
    #execute function
    csvToMongo()
    
  2. Python will read all CSV data in string format. This has nothing to do with MongoDB or pymongo inserting data in string format.
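
A quick way to confirm this with MongoDB out of the picture is to look at what csv.DictReader yields. The sketch below is only an illustration: it feeds the header and first row of the question's file through io.StringIO (used here so the example runs without country.csv on disk) and checks the type of each value.

```python
import csv
import io

# Header plus the first data row of the question's country.csv, held in memory
# so the example does not depend on the file being present.
sample = (
    "country_id,country_name,zone_id,minLat,maxLat,minLong,maxLong\n"
    "2,Bangladesh,1,20.6708832870000,26.4465255803000,88.0844222351000,92.6727209818000\n"
)

reader = csv.DictReader(io.StringIO(sample))
first_row = next(reader)

# Map each field name to the type name of its value; all of them come back as str,
# including the numeric-looking ones.
value_types = {field: type(value).__name__ for field, value in first_row.items()}
print(value_types)

# An explicit conversion is what produces a Python float,
# which PyMongo stores as a BSON double.
min_lat = float(first_row["minLat"])
print(type(min_lat).__name__, min_lat)
```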

    If you want type inference & automatic conversion from your CSV data, use something like Pandas read_csv.
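
As a rough sketch of the pandas route, assuming pandas is installed: read_csv infers integer and float columns on its own, and to_dict(orient="records") turns the frame into a list of dicts in the shape insert_many expects. The in-memory sample stands in for country.csv. Depending on the pandas version, the values in those dicts may be NumPy scalars rather than plain Python numbers, so it is worth checking that before handing them to PyMongo.

```python
import io

import pandas as pd

# A few rows from the question's file, held in memory for the example.
sample = (
    "country_id,country_name,zone_id,minLat,maxLat,minLong,maxLong\n"
    "2,Bangladesh,1,20.6708832870000,26.4465255803000,88.0844222351000,92.6727209818000\n"
    '3,"Sri Lanka",1,5.9683698592300,9.8240776636100,79.6951668639000,81.7879590189000\n'
)

# read_csv infers column types: the id columns become integers,
# the latitude/longitude columns become floats.
df = pd.read_csv(io.StringIO(sample))
print(df.dtypes)

# One dict per row, keyed by column name.
docs = df.to_dict(orient="records")
print(docs[0])

# With the `collection` object from the question, the insert would then be:
# collection.insert_many(docs)
```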

    Otherwise, do the type conversion in a similar loop with header – create a mapping between the field names and field types, then convert each field to that type:

    header = {
        'country_id': int, 'country_name': str, 'zone_id': int,
        'minLat': float, 'maxLat': float, 'minLong': float, 'maxLong': float
    }
    
    for row in reader:
        doc = {}
        for field, ftype in header.items():
            doc[field] = ftype(row[field])
        
        # the `doc=` + loop lines can also be written in one line as:
        # doc = {field: ftype(row[field]) for field, ftype in header.items()}
    
        collection.insert_one(doc)  # insert the converted doc, not the raw row
    

    (Without the conversion, the loop that copies each row into a new dict key by key is redundant; it could be just collection.insert(each).)

    Btw, if your file isn’t big, read & convert all the rows and then use insert_many.
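
For the opposite case, where the file is too big to hold comfortably in memory, one possible variation is to convert rows lazily and call insert_many on fixed-size batches. This is a sketch under assumptions, not something from the answers above: insert_in_batches, convert_row, FIELD_TYPES and the batch_size default are hypothetical names and values chosen for illustration, and `collection` is any object with an insert_many method, such as a PyMongo collection.

```python
from itertools import islice

# Field name -> converter, same idea as the `header` mapping above.
FIELD_TYPES = {
    'country_id': int, 'country_name': str, 'zone_id': int,
    'minLat': float, 'maxLat': float, 'minLong': float, 'maxLong': float,
}


def convert_row(row):
    """Return a new dict with each field converted to its mapped type."""
    return {field: ftype(row[field]) for field, ftype in FIELD_TYPES.items()}


def insert_in_batches(reader, collection, batch_size=1000):
    """Convert rows lazily and insert them batch_size at a time.

    Returns the total number of documents passed to insert_many.
    """
    docs = (convert_row(row) for row in reader)  # generator, nothing read yet
    total = 0
    while True:
        batch = list(islice(docs, batch_size))  # pull up to batch_size docs
        if not batch:
            break
        collection.insert_many(batch)
        total += len(batch)
    return total


# Possible usage with the names from the question:
# import csv
# with open('country.csv', newline='') as csv_file:
#     insert_in_batches(csv.DictReader(csv_file), mydb['country'])
```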
