I am trying to write a SQL query to get the month end prices for a particular instrument.
Table (contains some other columns that can be ignored):
instrumentId | unitPrice | reportedAt |
---|---|---|
1 | 90.22 | 2018-10-28 |
1 | 80.55 | 2018-10-30 |
1 | 70.55 | 2018-09-15 |
1 | 70.56 | 2018-09-23 |
Expected result:
instrumentId | unitPrice | reportedAt |
---|---|---|
1 | 80.55 | 2018-10-30 |
1 | 70.56 | 2018-09-23 |
There will be multiple rows for the same instrumentId
with different reportedAt
from different months. The idea is for each month and year (e.g. Oct 2018), I am trying to get only the last price reportedAt
for each month.
SELECT
"instrumentId",
"unitPrice",
MAX("reportedAt") as reportedAt
FROM
"InstrumentPrice"
WHERE
"instrumentId" = 90
GROUP BY
EXTRACT(MONTH FROM "reportedAt"),
EXTRACT(YEAR FROM "reportedAt")
ORDER BY
"reportedAt" DESC
However, this query keeps throwing an error saying the columns selected must appear in GROUP BY
clause or be used in an aggregate function.
This will not work because if I group by unitPrice
or reportedAt
, I will not get the desired results (because they are all unique).
2
Answers
The error in your query is caused by including the column
unitPrice
andinstrumentId
in theSELECT
clause without including it in an aggregate function or in theGROUP BY
clause.In SQL, when you use
GROUP BY
, you should either:You can achieve this using window functions. You can use the
ROW_NUMBER()
window function to assign a row number to each row within each month and year partition based on thereportedAt
column, and then filter for rows where the row number is 1 (indicating the last price reported for each month). Here’s how you can do it:In this query:
ROW_NUMBER()
window function. ThePARTITION BY
clause partitions the data byinstrumentId
, month, and year, and orders the rows within each partition byreportedAt
in descending order.group by
is not the right tool for your job. Here it is using window functionrow_number()
.Demo