I have a table ‘Rolls’ in Postgres where a column ‘A’ stores 10000 records each of type SMALLINT.
I am looking to write a SQL query which will take the last 100 rows of ‘Rolls’ and then take the first order difference of the column and then will return the average.
For example:-
Rolls
| A | ---- |
1
2
3
5
7
The first order difference of the last 4 rows of A is equal
[2, 2 , 1] or [(7-5), (5 – 3), (3 – 2)]
The final answer will be (2+2+1)/4 or 1.25
Is this possible to be executed in Postgresql?
3
Answers
Without calculating the average value. Used by
LAG
to take the previous row.Output
The first subquery considers the difference in "A". The second selects the last five rows. And the external query considers the average.
The
ORDER BY "A" DESC
sort is used to sort the rows in the desired order and take the desired five rowsLIMIT 5
.Output
First
cte
to add IDs using the window functionrow_number()
,cte2
to get last n rows, then we calculate the differences usingLEAD
, then latest step we calculate the average usingsum(diff)/count(1)
:Demo here
If we apply a little bit of math, I think we can simplify the calculation.
You have values a1, a2, …, an-1, an
The first order difference gives of the last x terms give:
an-an-1, an-1-an-2, …, an-(x-3)-an-(x-2), an-(x-2)-an-(x-1)
The sum of which is:
an-an-1 + an-1-an-2 + … + an-(x-3)-an-(x-2) + an-(x-2)-an-(x-1)
As you can see almost all terms appear once as a positive and once as a negative, so the whole thing just becomes
an-an-(x-1)
So the value you actually want becomes:
(an-an-(x-1))/x
With SQL and x=4 this becomes
This assumes an
id
column, for specifying the order.If
a
is to be taken in descending order, just replace all occurences ofid
witha
.And of course, for x=100 you should replace
4
with100
.This finally also assumes that there are enough row.
If there aren’t you have to come up with your own rules, what should happen.
db-fiddle to play with.