skip to Main Content

I have a table ‘Rolls’ in Postgres where a column ‘A’ stores 10000 records each of type SMALLINT.
I am looking to write a SQL query which will take the last 100 rows of ‘Rolls’ and then take the first order difference of the column and then will return the average.

For example:-

    Rolls 
|  A  | ---- |
   1
   2
   3
   5
   7

The first order difference of the last 4 rows of A is equal
[2, 2 , 1] or [(7-5), (5 – 3), (3 – 2)] The final answer will be (2+2+1)/4 or 1.25

Is this possible to be executed in Postgresql?

3

Answers


  1. CREATE TABLE Rolls (
        "A" SMALLINT
    );
    
    
    INSERT INTO Rolls ("A") VALUES    
     (1),
     (4),
     (7),
     (9),
     (10),
     (12),
     (15),
     (20),
     (23)
    ;
    

    Without calculating the average value. Used by LAG to take the previous row.

    SELECT *,
    "A" - LAG("A")  OVER (ORDER BY "A") AS diff 
    FROM Rolls
    ;
    

    Output

     A  | diff
    ----+------
      1 |
      4 |    3
      7 |    3
      9 |    2
     10 |    1
     12 |    2
     15 |    3
     20 |    5
     23 |    3
    

    The first subquery considers the difference in "A". The second selects the last five rows. And the external query considers the average.

    The ORDER BY "A" DESC sort is used to sort the rows in the desired order and take the desired five rows LIMIT 5.

    SELECT AVG(diff)
    FROM
    (SELECT *
    FROM
    (SELECT *,
    "A" - LAG("A")  OVER (ORDER BY "A") AS diff 
    FROM Rolls) t
    ORDER BY "A" DESC
    LIMIT 5
    ) y
    ;
    

    Output

            avg
    --------------------
     2.8000000000000000
    
    Login or Signup to reply.
  2. First cte to add IDs using the window function row_number(), cte2 to get last n rows, then we calculate the differences using LEAD, then latest step we calculate the average using sum(diff)/count(1):

    with cte as (
      select A, row_number() over () as row_id
      from Rolls
    ),
    cte2 as (
      select *
      from cte
      order by row_id desc
      limit 4 
    ),
    cte3 as (
      select A, A - LEAD(A) OVER(ORDER BY row_id desc) AS diff
      from cte2
    )
    select ROUND(cast(SUM(diff) as decimal)/count(1), 2)
    from cte3
    

    Demo here

    Login or Signup to reply.
  3. If we apply a little bit of math, I think we can simplify the calculation.

    You have values a1, a2, …, an-1, an

    The first order difference gives of the last x terms give:

    an-an-1, an-1-an-2, …, an-(x-3)-an-(x-2), an-(x-2)-an-(x-1)

    The sum of which is:

    an-an-1 + an-1-an-2 + … + an-(x-3)-an-(x-2) + an-(x-2)-an-(x-1)

    As you can see almost all terms appear once as a positive and once as a negative, so the whole thing just becomes

    an-an-(x-1)

    So the value you actually want becomes:
    (an-an-(x-1))/x

    With SQL and x=4 this becomes

    select 
        (
            a - lead(a,(4 - 1)) over (order by id desc)
        )/4.0 avg
    from rolls
    order by id desc
    limit 1
    

    This assumes an id column, for specifying the order.
    If a is to be taken in descending order, just replace all occurences of id with a.

    And of course, for x=100 you should replace 4 with 100.

    This finally also assumes that there are enough row.
    If there aren’t you have to come up with your own rules, what should happen.

    db-fiddle to play with.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search