I’ve got two tables:
utilities
:
id | timestamp | action |
---|---|---|
901 | 2024-08-11 09:59:25.000 | on power |
902 | 2024-08-11 09:59:35.000 | on water |
903 | 2024-08-11 09:59:55.000 | off power |
904 | 2024-08-11 10:01:25.000 | on gas |
905 | 2024-08-11 10:02:35.000 | off water |
906 | 2024-08-11 10:11:18.000 | off power |
907 | 2024-08-11 10:31:28.000 | off gas |
908 | 2024-08-11 11:15:37.000 | on power |
items
:
id | timestamp | action |
---|---|---|
906 | 2024-08-11 09:59:45.000 | on lights |
907 | 2024-08-11 09:59:58.000 | off lights |
908 | 2024-08-11 10:15:34.000 | on tap |
909 | 2024-08-11 10:18:25.000 | on heating |
910 | 2024-08-11 10:21:44.000 | off heating |
911 | 2024-08-11 11:02:35.000 | off tap |
912 | 2024-08-11 12:01:08.000 | open door |
913 | 2024-08-11 12:11:28.000 | closer door |
I’m trying to combine them, without the ids clashing (generate a new id?) and then want to do a date_trunc('hour', timestamp) as time, COUNT(*) as metric
to work out how many actions are happening in the table per hour, so the end result would be:
time | metric |
---|---|
2024-08-11 09:00:00.000 | 5 |
2024-08-11 10:00:00.000 | 7 |
2024-08-11 11:00:00.000 | 2 |
2024-08-11 12:00:00.000 | 2 |
I tried this query, but it complains about:
Utilities.timestamp must appear in the GROUP BY …
WITH one AS (
SELECT
date_trunc('hour', timestamp) as timeOne,
COUNT(*) as utilities_count
FROM Utilities
ORDER BY timeOne
),
two AS (
SELECT
date_trunc('hour', timestamp) as timeTwo,
COUNT(*) as item_count
FROM Items
ORDER BY timeTwo
)
SELECT
SUM(utilities_count, item_count) as metric,
timeOne as time
FROM one, two
ORDER BY 1;
Any idea how to get the data combined and counted with the correct hourly binning?
2
Answers
Simply:
Clashing IDs are not relevant to this query. Simply append rows from both tables with
UNION ALL
(not justUNION
!), usedate_trunc()
and aggregate.I removed a more complex query that would only make sense with (unlikely) expression indexes on underlying tables.
count
is an aggregate function which combines the values of multiple rows together. You have to tell Postgres how to group rows with agroup by
clause. You want to group by hour, soGROUP BY date_trunc('hour', timestamp)
orGROUP BY 1
.Also, you do not want to do a cross join which will result in each row in one being joined with every row in two. Instead you can union both tables together and then group, but this can’t take advantage of indexes. Instead, count both tables (which can use an index), then union and group and sum the counts together (only a handful of rows).
Other notes…
timestamp
. They risk errors and don’t have a lot of descriptive power. Follow the thing_at for timestamps and thing_on for dates.