I’m connecting to Postgres 15.1 in R 4.3.0 using DBI 1.2.2 and dbplyr 2.3.2.
Here is a statement that works:
> my_table %>% mutate(date = as.Date(created_at)) %>% head %>% pull(date)
[1] "2023-09-10" "2023-11-29" "2023-12-03" "2024-04-12" "2023-12-07" "2024-04-27"
But this doesn’t:
> my_table %>% mutate(date = as.Date(created_at, tz="Asia/Dhaka")) %>% head %>% pull(date)
Error in `purrr::pmap()`:
ℹ In index: 1.
ℹ With name: date.
Caused by error in `as.Date()`:
! unused argument (tz = "Asia/Dhaka")
Run `rlang::last_trace()` to see where the error occurred.
But this does:
> my_table %>% collect %>% mutate(date = as.Date(created_at, tz="Asia/Dhaka")) %>% head %>% pull(date)
[1] "2023-09-10" "2023-11-30" "2023-12-03" "2024-04-13" "2023-12-07" "2024-04-27"
That leads me to believe that dbplyr doesn't know how to translate the "tz" argument of as.Date into SQL.
Can I get the database to return dates in a given timezone, or do I have to collect all my datasets?
I've seen mention of the timezone_out argument to RPostgres's dbConnect, but I haven't been able to get it to do what I think it should.
2 Answers
Setting the time zone on the connection worked for me, as suggested in this answer.
Line 5 shows the difference between the UTC date (on the left) and the America/Chicago date (on the right). America/Chicago is currently UTC-0600, so I'd expect the date to be one day earlier, which it is.
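For reference, a minimal sketch of setting the session time zone on the connection (the dbname is a placeholder; adjust the connection arguments and zone for your setup):

```r
library(DBI)

# Connect with RPostgres; connection details here are hypothetical.
con <- dbConnect(RPostgres::Postgres(), dbname = "mydb")

# Set the Postgres session time zone. Casts from timestamptz to date
# (which is what dbplyr generates for as.Date) then happen in this zone.
dbExecute(con, "SET TIME ZONE 'America/Chicago'")
```

After this, as.Date(created_at) in the dbplyr pipeline should yield the America/Chicago date without collect, provided created_at is a timestamptz column.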
===
More detail on why I think timezone_out won't work: it does something, but I don't think I can get it to display the right date without collect. It displays created_at in my laptop's time zone ("CST"):
but it's the same UTC timestamp:
However, I think the date field is still the UTC date (see line 5, which should be 2023-11-30 but is 2023-12-01), and I can't shift it without collect.
===
You may find the timezone argument useful; see here. You may also want to check the data types in the underlying PostgreSQL table. Things will be much more predictable if created_at is (or is cast to) TIMESTAMP WITH TIME ZONE, perhaps using ::TIMESTAMPTZ.
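If created_at turns out to be a plain TIMESTAMP (without time zone), one sketch is to do the zone conversion in raw SQL inside the pipeline via dbplyr's sql() escape hatch. The zone names below are assumptions (the stored values are taken to be UTC):

```r
library(dplyr)
library(dbplyr)

# sql() passes raw Postgres syntax through untranslated.
# AT TIME ZONE 'UTC' tags the naive timestamp as UTC; the second
# AT TIME ZONE converts it to Dhaka local time before the ::date cast.
my_table %>%
  mutate(
    date = sql("(created_at AT TIME ZONE 'UTC' AT TIME ZONE 'Asia/Dhaka')::date")
  ) %>%
  head()
```

This keeps the computation in the database, at the cost of hard-coding Postgres-specific SQL in the pipeline.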