skip to Main Content

I’m currently using AWS Athena and I’m looking to convert a string field ‘YYYYMM’ to date (i.e ‘202208’ to 2022-08-dd) and define the date (dd) as the last day in that month. I would first convert the string to date then I need to apply a function over the top to convert it to the last day of that month (usually by default the date would be set to the 1st of the month).

I was looking in the Presto documents (https://prestodb.io/docs/current/functions/datetime.html) and came across the section for the MySQL Str_to_date function, however, I receive errors when trying to run the below in Athena.

SELECT 
MONTH,
STR_TO_DATE(CONCAT(MONTH,'01'),'%Y%m%d') 
FROM db.table 
WHERE MONTH IN ('202208')

The error message on Athena is that the function isn’t registered 'SYNTAX_ERROR: line 3:1: Function str_to_date not registered'.

I have now used the date_parse function which does the conversion, however, there is a timestamp in the output. Although it won’t impact how I then run reports on the back of this is there a way to remove the timestamp?

SELECT 
MONTH,
date_parse((Month),'%Y%m') 
FROM db.table 
WHERE MONTH IN ('202208')

Output for the above is 2022-08-01 00:00:00.000

My next step is to convert this output date to the last day of the month. I tried to apply the last day of the month function but that isn’t supported. What would the MySQL alternative be to this? Is there a function that supports this or should I do the math with a date trunc?

SELECT 
MONTH,
last_day_of_month(date_parse((Month),'%Y%m'))
FROM db.table 
WHERE MONTH IN ('202208')

Thanks

2

Answers


  1. One trick would be to parse your text dates into the first day of that month. Then, add one month and subtract one day to get the last day of that month.

    WITH yourTable AS (
        SELECT '202208' AS MONTH
    )
    
    SELECT
        MONTH,
        DATE_ADD('day',
                 -1,
                 DATE_ADD('month',
                          1,
                          DATE_PARSE(CONCAT(MONTH, '01'), '%Y%m%d'))) AS LAST_DAY
    FROM yourTable;
    
    Login or Signup to reply.
  2. Similar solution in spirit to what is proposed in the other answer, but with less nesting.

    SELECT 
    DATE_PARSE(dt, '%Y%m') + INTERVAL '1' month - INTERVAL '1' day last_day_of_month
    FROM (SELECT '202208' dt) t
    
    -- outputs 
    last_date
    2022-08-31 00:00
    

    DATE_PARSE(<field>, '%Y%m') is a valid date format in athena and will parse into the first date of the month.

    Adding an interval '1' month and then removing interval '1' day yields the last date of the month. You could remove any other shorter interval, say, if you removed '1' second, you’d end up with the time 2022-08-31 23:59:59.000.

    I have now used the date_parse function which does the conversion, however, there is a timestamp in the output. Although it won’t impact how I then run reports on the back of this is there a way to remove the timestamp?

    Yes. You can convert the timestamp to a date easily by passing the value to the DATE function. See below.

    SELECT 
    DATE(DATE_PARSE(dt, '%Y%m') + INTERVAL '1' month - INTERVAL '1' day) last_day_of_month
    FROM (SELECT '202208' dt) t
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search