In a PostgreSQL have the table Sales
containing the following data:
| id(PK) | eva_dts_id(FK) | realm_name | device_id | timestamp | product_id | product_name | quantity | product_price | revenue |
realm_name
and device_id
are constants and even if they changed in the future they would be updated accordingly on this table.
I’m trying to create the view Products
on it with the following schema:
| realm_name | device_id | product_id | product_name | product_price |
So far I used the following query:
SELECT DISTINCT realm_name, device_id, product_id, product_name, product_price
FROM public.Sales;
This works assuming the product list of a device is immutable and so won’t change in the future. I’d like to release that assumption and create a product list data for a device considering always the most recent record (based on the timestamp
found on the Sales
table). So for instance if I have the product "Chocomilk" that had the records:
| id(PK) | eva_dts_id(FK) | realm_name | device_id | timestamp | product_id | product_name | quantity | product_price | revenue |
"03ef91f6-bb24-4c8e-90ef-366cc4dee5a6" "e853dcec-c369-4111-816d-1645067df8e1" "tenant" "RbbMIyemWTOI99N6XZx1hA" "2023-03-26 22:43:31.454734" "10" "Chocomilk" 1 0.38 0
"03ef91f6-bb24-4c8e-90ef-366cc4dee5a6" "e853dcec-c369-4111-816d-1645067df8e1" "tenant" "RbbMIyemWTOI99N6XZx1hA" "2023-04-12 22:43:31.454734" "10" "Chocomilk" 1 2.3 0
"03ef91f6-bb24-4c8e-90ef-366cc4dee5a6" "e853dcec-c369-4111-816d-1645067df8e1" "tenant" "RbbMIyemWTOI99N6XZx1hA" "2023-05-18 22:43:31.454734" "10" "Chocomilk" 1 1.5 0
I would consider the last record only.
How can I rewrite the view query in order to achieve this?
5
Answers
Proposing myself an alternative solution to @markalex 's answer and following the advice of @tinazmu :
Use
ROW_NUMBER()
to order your items information by timestamp and get only newest. Something along the lines ofCheck carefully
PARTITION BY
clause ofROW_NUMBER()
to contain unique identifiers of item, that don’t change over time.I prefer a common table expression for queries like this, it makes it easier to identify what the data is and how you are filtering it.
Row_number and partition, give the query engine a better idea of what you are actually after. Sql is a declarative language, whenever possible you should describe what you want, not do tricks to lead it to the right answer.
Postgres offers another option: Distinct on clause of select statement. This clause results in keeping on the first row matching the distinct on expression (according to the order by phrase). What you are looking for is: (see demo)
Note: IMHO naming a column
timestamp
is poor practice as it is a SQL reserved word and a Postgres data type (although not Postgres reserved). I have substitutedsale_ts
in above and demo.Another option (might be more intuitive) is to filter out older records using EXISTS function:
This query will make sure there isn’t any newer record for the same product_id