I have an Athena table with a string
column which looks like:
+-------------------+
| employee_size |
+-------------------+
| GREATER THAN 2000 |
+-------------------+
| 500 - 999 |
+-------------------+
| 28.00 |
+-------------------+
| unknown |
+-------------------+
| 563 |
+-------------------+
I want to convert the column values into integer
if possible, else it should be null
. So, the desired output should look like:
+---------------+
| employee_size |
+---------------+
| |
+---------------+
| |
+---------------+
| 28 |
+---------------+
| |
+---------------+
| 563 |
+---------------+
I tried to use combination of queries the closest I think I could get was:
SELECT
CASE
WHEN employee_size LIKE '% %' THEN NULL
WHEN employee_size LIKE '%-%' THEN NULL
WHEN regexp_like(employee_size,'([A-Za-z]') THEN NULL
WHEN employee_size LIKE '%.%' THEN CAST(employee_size AS decimal)
ELSE CAST(employee_size AS integer)
END AS employee_size
FROM
"table_name";
But this code results in an error:
INVALID_FUNCTION_ARGUMENT: end pattern with unmatched parenthesis
If you guys have faced something similar, please suggest a solution.
EDIT: I forgot to mention that if there is decimal
value like 28.00
or 5.64
it should ignore whatever is there after the decimal and only have 28
or 5
2
Answers
You could try the following logic:
Are you looking for something like that?