skip to Main Content

I’m connecting to Postgres 15.1 in R 4.3.0 using DBI 1.2.2 and dbplyr 2.3.2.

Here is a statement that works:

> my_table %>% mutate(date = as.Date(created_at)) %>% head %>% pull(date)
[1] "2023-09-10" "2023-11-29" "2023-12-03" "2024-04-12" "2023-12-07" "2024-04-27"

But this doesn’t:

> my_table %>% mutate(date = as.Date(created_at, tz="Asia/Dhaka")) %>% head %>% pull(date)
Error in `purrr::pmap()`:
ℹ In index: 1.
ℹ With name: date.
Caused by error in `as.Date()`:
! unused argument (tz = "Asia/Dhaka")
Run `rlang::last_trace()` to see where the error occurred.

But this does:

> my_table %>% collect %>% mutate(date = as.Date(created_at, tz="Asia/Dhaka")) %>% head %>% pull(date)
[1] "2023-09-10" "2023-11-30" "2023-12-03" "2024-04-13" "2023-12-07" "2024-04-27"

That leads me to believe that dbplyr doesn’t know how to use the "tz" argument of as.Date.

Can I get the database to return dates in a given timezone, or do I have to collect all my datasets?

I’ve seen mention of timezone_out for Postgres’s dbConnect, but I haven’t been able to get it to do what I think it should.

2

Answers


  1. Chosen as BEST ANSWER

    It worked for me to set the time zone on the connection, as from this answer.

    con <- DBI::dbConnect(RPostgres::Postgres(), dbname = dbname)
    # Setting the timezone will make as.Date work in this time zone
    DBI::dbExecute(con, paste0("SET TIME ZONE 'America/Chicago'"))
    my_table <- tbl(con, "my_table")
    my_table %>% mutate(date = as.Date(created_at)) %>% head %>% select(created_at, date)
    # Source:   SQL [6 x 2]
    # Database: postgres  [me@mydb]
      created_at          date      
      <dttm>              <date>    
    1 2024-03-06 20:07:25 2024-03-06
    2 2024-05-18 22:35:13 2024-05-18
    3 2023-11-28 18:02:03 2023-11-28
    4 2024-04-23 18:20:55 2024-04-23
    5 2023-12-01 00:45:38 2023-11-30
    6 2023-07-10 19:17:10 2023-07-10
    

    Line 5 shows a difference between UTC date (on the left) and America/Chicago date (on the right). America/Chicago is currently -0600, so I'd expect the date to be one day earlier, which it is.

    ===

    More detail on why I think timezone_out won't work: it does something, but I don't think I can get it to display the right date without collect.

    con <- DBI::dbConnect(RPostgres::Postgres(), dbname = dbname, timezone_out="America/Chicago")
    my_table <- tbl(con, "my_table")
    my_table %>% mutate(date = as.Date(created_at)) %>% head %>% select(created_at, date)
    # Source:   SQL [6 x 2]
    # Database: postgres  [me@mydb]
      created_at          date      
      <dttm>              <date>    
    1 2024-03-06 14:07:25 2024-03-06
    2 2024-05-18 17:35:13 2024-05-18
    3 2023-11-28 12:02:03 2023-11-28
    4 2024-04-23 13:20:55 2024-04-23
    5 2023-11-30 18:45:38 2023-12-01
    6 2023-07-10 14:17:10 2023-07-10
    

    It's displaying created_at in my laptop timezone ("CST"):

    > my_table %>% head(1) %>% pull(created_at)
    [1] "2024-03-06 14:07:25 CST"
    

    but it's the same UTC timestamp:

    > my_table %>% head(1) %>% pull(created_at) %>% as.POSIXct(tz="UTC")
    [1] "2024-03-06 20:07:25 UTC"
    

    However, I think date field is the UTC date (see line 5, which should be 2023-11-30 but is 2023-12-01), and I can't shift it without collect.


  2. You may find the timezone argument to be useful. See here. You may want to check the data types in the underlying PostgreSQL table. Things will be much more predictable if this is (or is cast to) TIMESTAMP WITH TIME ZONE (perhaps using ::TIMESTAMPTZ?).

    conn <- dbConnect(Postgres(),
                      host = "dbhost",
                      user = "dbuser",
                      dbname = "dbname",
                      timezone = "America/Chicago"
    )
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search