I’m currently using AWS Athena and I’m looking to convert a string field ‘YYYYMM’ to date (i.e ‘202208’ to 2022-08-dd) and define the date (dd) as the last day in that month. I would first convert the string to date then I need to apply a function over the top to convert it to the last day of that month (usually by default the date would be set to the 1st of the month).
I was looking in the Presto documents (https://prestodb.io/docs/current/functions/datetime.html) and came across the section for the MySQL Str_to_date function, however, I receive errors when trying to run the below in Athena.
SELECT
MONTH,
STR_TO_DATE(CONCAT(MONTH,'01'),'%Y%m%d')
FROM db.table
WHERE MONTH IN ('202208')
The error message on Athena is that the function isn’t registered 'SYNTAX_ERROR: line 3:1: Function str_to_date not registered'
.
I have now used the date_parse function which does the conversion, however, there is a timestamp in the output. Although it won’t impact how I then run reports on the back of this is there a way to remove the timestamp?
SELECT
MONTH,
date_parse((Month),'%Y%m')
FROM db.table
WHERE MONTH IN ('202208')
Output for the above is 2022-08-01 00:00:00.000
My next step is to convert this output date to the last day of the month. I tried to apply the last day of the month function but that isn’t supported. What would the MySQL alternative be to this? Is there a function that supports this or should I do the math with a date trunc?
SELECT
MONTH,
last_day_of_month(date_parse((Month),'%Y%m'))
FROM db.table
WHERE MONTH IN ('202208')
Thanks
2
Answers
One trick would be to parse your text dates into the first day of that month. Then, add one month and subtract one day to get the last day of that month.
Similar solution in spirit to what is proposed in the other answer, but with less nesting.
DATE_PARSE(<field>, '%Y%m')
is a valid date format in athena and will parse into the first date of the month.Adding an
interval '1' month
and then removinginterval '1' day
yields the last date of the month. You could remove any other shorter interval, say, if you removed'1' second
, you’d end up with the time2022-08-31 23:59:59.000
.Yes. You can convert the
timestamp
to adate
easily by passing the value to theDATE
function. See below.