In my project (Django, using Postgresql) I need to display statistics on the number of registered users by day. In the database, all time fields are stored with the UTC zone. In settings.py parameters set:
USE_TZ = True
TIME_ZONE = 'Europe/Moscow'
Code in a view that returns this stitistic:
(date1 & date2 parameters are obtained from a request)
self.queryset = User.objects
if date1 and date2:
self.queryset = self.queryset.filter(
created_at__range=(date1, date2.replace(hour=23, minute=59, second=59))
)
self.queryset = self.queryset
.extra({'day': "date(created_at)"})
.values("day")
.annotate(
count=Count("id"),
site_count=Count("id", filter=Q(account_type=User.TYPE_WEBSITE)),
bot_count=Count("id", filter=Q(account_type=User.TYPE_TELEGRAM)),
)
.order_by('day')
return Response({
"new_users": self.queryset
})
Unfortunately, if I specify date parameters both ‘2021-05-11’, then the "wrong" answer will be formed:
"new_users": [
{
"day": "2021-05-10",
"count": 2,
"site_count": 1,
"bot_count": 1
},
{
"day": "2021-05-11",
"count": 4,
"site_count": 2,
"bot_count": 2
}
],
We see the two days instead of one.
The following SQL is generated by Django:
SELECT (date(created_at)) AS "day", COUNT("authentication_user"."id") AS "count",
COUNT("authentication_user"."id") FILTER
(WHERE "authentication_user"."account_type" = 'Website') AS "site_count", COUNT("authentication_user"."id") FILTER
(WHERE "authentication_user"."account_type" = 'Telegram') AS "bot_count"
FROM "authentication_user" WHERE "authentication_user"."created_at" BETWEEN '2021-05-11 00:00:00+03:00' AND '2021-05-11 23:59:59+03:00'
GROUP BY (date(created_at) ) ORDER BY "day" ASC
I experimented and made select queries manually:
SELECT * from authentication_user WHERE created_at BETWEEN '2021-06-14 00:00:00+03:00' AND '2021-06-14 23:59:59+03:00'
As a result of this request, a correct result is generated, but it is generated in the UTC time zone. I see a lot of records with creation dates between 2021-06-13 21:00 and 2021-06-14 21:00.
I can see that the initial query also produces the correct answer, but due to the fact that each resulting date in the UTC zone, the
SELECT (date (created_at)) AS "day"
creates a split into two days.
Please tell me how I can get postgresql to form a date-based split of results based on my time zone.
Thank you in advance!
2
Answers
Solved:
python:
What you did works, but it’s not ideal. The django docs recommend you use
extra()
only as a last resort, because this tool will be deprecated:https://docs.djangoproject.com/en/3.2/ref/models/querysets/#extra
Instead, to make your application more flexible and avoid dropping down to raw SQL unnecessarily, use
activate()
to set the timezone for a given query:https://docs.djangoproject.com/en/3.2/topics/i18n/timezones/#default-time-zone-and-current-time-zone
https://docs.djangoproject.com/en/3.2/ref/utils/#django.utils.timezone.activate