Need to retrieve a column from a group-by query, and this column is not part of the group-by clause.
I have a table that receives telemetry from equipment:
COLUMNS:
- transmission_tag varchar — this is kind of a transmission package identifier
- equipment_id int
- measurement int — the value measured
- uts int — the timestamp ‘unixtime’ of the measurement
Then I need to get the most recent measurement for each piece of equipment:
SELECT equipment_id, max(uts)
FROM telemetry
GROUP BY equipment_id
-- works OK and is quick; the table has ~100 million rows and an index on (equipment_id, uts)
The problem is that I also need the "transmission_tag" associated with these rows.
I tried with "WHERE EXISTS" / "WITH auxquery (…) SELECT…", but I was not able to solve this problem.
The example below retrieves exactly what I need (for testing purposes), but it is really dumb. In the real-world database, it takes 1 hour to run (it does a sequential scan of all rows).
SELECT transmission_tag
FROM telemetry
WHERE CAST(equipment_id as VARCHAR) || '_' || CAST(uts as VARCHAR) IN
(SELECT CAST(equipment_id as VARCHAR) || '_' || CAST(max(uts) as VARCHAR)
FROM telemetry
GROUP BY equipment_id)
The rows in the table are unique: there is only one row for each (equipment_id, uts) pair.
In case the answer is db specific, I’m using Postgres.
3 Answers
Use an inner join to join your aggregated dataset back to the table, then get transmission_tag.
You want to filter the table, not aggregate it. In Postgres, I would recommend distinct on, which does exactly what you ask for. This gives you the entire latest row for each piece of equipment.
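A minimal sketch of both approaches, assuming the telemetry table and column names from the question. The alias names t, latest and max_uts are illustrative only, and neither query has been run against the real data set:

```sql
-- Approach 1: join the grouped result back to the table.
-- The subquery finds the newest uts per equipment_id; the join then
-- picks the full row that carries that (equipment_id, uts) pair.
SELECT t.equipment_id, t.uts, t.transmission_tag
FROM telemetry AS t
INNER JOIN (
    SELECT equipment_id, max(uts) AS max_uts
    FROM telemetry
    GROUP BY equipment_id
) AS latest
    ON latest.equipment_id = t.equipment_id
   AND latest.max_uts = t.uts;

-- Approach 2: DISTINCT ON (PostgreSQL-specific).
-- Keeps the first row of each equipment_id group in ORDER BY order,
-- so sorting by uts DESC keeps the most recent row.
SELECT DISTINCT ON (equipment_id) *
FROM telemetry
ORDER BY equipment_id, uts DESC;
```

Since (equipment_id, uts) is unique, both queries should return one row per equipment_id. Comparing their plans with EXPLAIN on the real table is the safest way to see which one makes better use of the existing index.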
You can use the distinct on clause: order by equipment_id, uts desc, then take the first row for each distinct equipment_id. Once you have that row, you can get any field value from it.
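As a sketch of that description, again assuming the table and columns from the question, the query could look like the following. The column list is only an illustration; any column of the selected row can be returned:

```sql
-- DISTINCT ON requires its expression to match the leftmost ORDER BY
-- expression. Rows are sorted by equipment_id, newest uts first, and
-- the first row of each equipment_id group is kept.
SELECT DISTINCT ON (equipment_id)
       equipment_id,
       uts,
       measurement,
       transmission_tag
FROM telemetry
ORDER BY equipment_id, uts DESC;
```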