I want to group all entries in my table by their user_type property. However, the user_type entries exist in a {type of user} {date account was created} format, e.g.
id | user_type
--------------
01 | 'admin 2024-03-01'
02 | 'admin 2024-03-01'
03 | 'user 2024-03-02'
04 | 'user 2024-03-02'
05 | 'premium_user 2024-03-03'
I want to get the counts of each type of user, ignoring the date at the end of the value. So for the above example, my query would return:
user_type | count
----------------------
'admin' | 2
'user' | 2
'premium_user' | 1
The user types are subject to change, so I can’t hardcode in filters such as select count(id) where user_type like 'admin %'.
2 Answers
We can use SUBSTRING() with a regular expression here.
Here is a working SQL fiddle.
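A minimal sketch of the regular-expression approach in Postgres. The table name tbl is hypothetical (the question does not name the table), and the pattern assumes the user type itself never contains a space:

```sql
-- Hypothetical table name: tbl. Assumes the type part contains no space.
-- substring(string FROM pattern) returns the first match of the POSIX regex,
-- here everything from the start of the value up to the first space.
SELECT substring(user_type FROM '^[^ ]+') AS user_type
     , count(*) AS count
FROM   tbl
GROUP  BY 1   -- group by the first output column, not the raw user_type column
ORDER  BY 1;
```

Grouping by position (or by the full expression) matters here: in Postgres, GROUP BY user_type would resolve to the original column rather than the truncated output column of the same name.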
Obviously, you need to split that column into two columns: user_type text (or some enumeration type) and date_created date.

While stuck with your unfortunate design, and if the trailing date has the fixed format your sample data suggests, left() with a negative length to truncate the last 11 characters is simplest and cheapest (much cheaper than a regular expression).

The solution is for Postgres. You also tagged Redshift, which is not Postgres at all. There, left() does not support negative length, so you might need a workaround.

About the numeric references in GROUP BY 1 and ORDER BY 2: they work in both Postgres and Redshift.
fiddle
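The column split suggested at the start of this answer could look roughly like this in Postgres. It is a sketch under the same assumptions (hypothetical table name tbl, every value ending in a space plus a YYYY-MM-DD date); test it on a copy of the data first:

```sql
-- Hypothetical table name: tbl. Postgres syntax.
ALTER TABLE tbl ADD COLUMN date_created date;

-- Both expressions read the old value of user_type,
-- so the date is extracted before the column is truncated.
UPDATE tbl
SET    date_created = right(user_type, 10)::date
     , user_type    = left(user_type, -11);
```

After that, the count becomes a plain GROUP BY on user_type, with no string manipulation needed.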