skip to Main Content

I am using Python 3.7 on Debian 10

I have a number of pre-existing .csv files with these columns:
first_name, last_name, birthdate, phone, email
I am importing them into a postgres database with Django as the framework.

My Django model:

from django.db import models

class User(models.Model):
    first_name = models.TextField(blank=False, null=False)
    last_name = models.TextField(blank=False, null=False)
    birthdate = models.TextField(blank=True, null=True)
    phone = models.TextField(blank=False, null=False)
    email = models.TextField(blank=False, null=False)

Custom Django Management Command to import file import_users.py:

class Command(BaseCommand):

    def handle(self, *args, **options):

        users_file = open(f'{settings.DATA_IMPORT_LOCATION}/users.csv', 'r')

        for counter, line in enumerate(users_file):
        
            line_fields = line.split(',')

            first_name = line_fields[0]
            last_name = line_fields[1]
            birthdate = line_fields[2]
            phone = line_fields[3]
            email = line_fields[4]

            u = User()
            u.first_name = first_name
            u.last_name = last_name
            u.birthdate = birthdate
            u.phone = phone
            u.email = email
            u.save()

Output sample when running the following Django ORM query :

> for u in User.objects.all():
        print(u.birthdate)

Output:

birthdate

2015-05-28

2009-06-14

2007-01-01

2007-02-17

2008-05-16

2013-01-19

2008-07-24

2015-05-01

2007-06-03

2007-01-17

When birthdate = models.TextField is set to TextField, I can import these .csv files into my Postgres database successfully with my management command.

This makes sense because all the .csv data are strings.

However, I want to correctly set the model to read as a date, i.e. birthdate = models.DateField() in order to render further calculations, such as finding user’s birthdays within certain time deltas

When making this change to DateField and then trying to import the .csv to database with command ./manage.py import_users, I receive the following error:
" ValueError: time data ‘birthdate’ does not match format ‘%Y-%m-%d’ "

I have tried converting the .csv birthdate data into a datetime object in import_users.py using commands such as:

 u.birthdate = datetime.datetime.strptime(birthdate, "%Y-%m-%d")

and many variations on this, but the error message is the same.

I think I am not understanding how to correctly modify the data of a given column. I have also changed my .csv import code to read with a ‘w’ instead of ‘r’
(users_file = open(f'{settings.DATA_IMPORT_LOCATION}/users.csv', 'w')) but this does not help.

I am encountering this problem when I try to change the models.py fields to anything other than TextField. Again, this conceptually makes sense since I’m asking Django to work on pre-existing data that is inherently a string.

I am unclear on which point in the import process I should be modifying the data type to convert to integers for dates, phone numbers, birthdays, etc.

I would dearly appreciated any guidance, be it specific lines of code or any meta-explanations of the principles behind this so I can apply understanding to these and future problems.

Thank you!

2

Answers


  1. Chosen as BEST ANSWER

    Thank you, KuroiKuro! This was helpful. I realised right after posting that I was asking to convert the header rather than the data in the rows following.

    Here's what I ended up doing:

    with open(f'{settings.DATA_IMPORT_LOCATION}/users.csv', 'r') as users_csv:
    
                users_file = csv.reader(users_csv)
                next(users_file)  # skip header row
    
                for counter, line in enumerate(users_file):
    
                    first_name = line[0]
                    last_name = line[1]
                    birthdate = datetime.datetime.strptime(line[2], '%Y-%m-%d')
                    birthdate = birthdate.strftime('%Y-%m-%d')  # to remove the time component
                    phone = line[3]
                    email = line[4]
    
                    u = User()
                    u.first_name = first_name
                    u.last_name = last_name
                    u.birthdate = birthdate
                    u.phone = phone
                    u.email = email
                    u.save()
    

  2. I have tested this using Django 3.2.2 and a Postgres docker container, and I was able to simply save the date string (e.g. "2007-02-17") without any extra type conversions. When retrieving the saved model, Django returns the date field as a datetime.date object.

    >>> u.birthdate = "2007-02-17"
    >>> u.save()
    >>> u = TestModel.objects.last()
    >>> u.birthdate
    datetime.date(2007, 2, 17)
    

    Perhaps you can try printing the value of line_fields[2] before saving it to birthdate, to check if the correct value is being saved.

    As a side note, opening the file with "w" does not help as it will open a file for writing only, which will erase all the contents of the file.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search