I am using Python 3.7 on Debian 10
I have a number of pre-existing .csv files with these columns:
first_name, last_name, birthdate, phone, email
I am importing them into a postgres database with Django as the framework.
My Django model:
from django.db import models
class User(models.Model):
first_name = models.TextField(blank=False, null=False)
last_name = models.TextField(blank=False, null=False)
birthdate = models.TextField(blank=True, null=True)
phone = models.TextField(blank=False, null=False)
email = models.TextField(blank=False, null=False)
Custom Django Management Command to import file import_users.py:
class Command(BaseCommand):
def handle(self, *args, **options):
users_file = open(f'{settings.DATA_IMPORT_LOCATION}/users.csv', 'r')
for counter, line in enumerate(users_file):
line_fields = line.split(',')
first_name = line_fields[0]
last_name = line_fields[1]
birthdate = line_fields[2]
phone = line_fields[3]
email = line_fields[4]
u = User()
u.first_name = first_name
u.last_name = last_name
u.birthdate = birthdate
u.phone = phone
u.email = email
u.save()
Output sample when running the following Django ORM query :
> for u in User.objects.all():
print(u.birthdate)
Output:
birthdate
2015-05-28
2009-06-14
2007-01-01
2007-02-17
2008-05-16
2013-01-19
2008-07-24
2015-05-01
2007-06-03
2007-01-17
When birthdate = models.TextField is set to TextField, I can import these .csv files into my Postgres database successfully with my management command.
This makes sense because all the .csv data are strings.
However, I want to correctly set the model to read as a date, i.e. birthdate = models.DateField() in order to render further calculations, such as finding user’s birthdays within certain time deltas
When making this change to DateField and then trying to import the .csv to database with command ./manage.py import_users, I receive the following error:
" ValueError: time data ‘birthdate’ does not match format ‘%Y-%m-%d’ "
I have tried converting the .csv birthdate data into a datetime object in import_users.py using commands such as:
u.birthdate = datetime.datetime.strptime(birthdate, "%Y-%m-%d")
and many variations on this, but the error message is the same.
I think I am not understanding how to correctly modify the data of a given column. I have also changed my .csv import code to read with a ‘w’ instead of ‘r’
(users_file = open(f'{settings.DATA_IMPORT_LOCATION}/users.csv', 'w')
) but this does not help.
I am encountering this problem when I try to change the models.py fields to anything other than TextField. Again, this conceptually makes sense since I’m asking Django to work on pre-existing data that is inherently a string.
I am unclear on which point in the import process I should be modifying the data type to convert to integers for dates, phone numbers, birthdays, etc.
I would dearly appreciated any guidance, be it specific lines of code or any meta-explanations of the principles behind this so I can apply understanding to these and future problems.
Thank you!
2
Answers
Thank you, KuroiKuro! This was helpful. I realised right after posting that I was asking to convert the header rather than the data in the rows following.
Here's what I ended up doing:
I have tested this using Django 3.2.2 and a Postgres docker container, and I was able to simply save the date string (e.g. "2007-02-17") without any extra type conversions. When retrieving the saved model, Django returns the date field as a datetime.date object.
Perhaps you can try printing the value of
line_fields[2]
before saving it to birthdate, to check if the correct value is being saved.As a side note, opening the file with "w" does not help as it will open a file for writing only, which will erase all the contents of the file.