skip to Main Content

I want to get moving sum and moving average on each date for last 7 days (including current day). I used WINDOW function and used ROWS BETWEEN to frame the function which calculates correctly, but it shows sum and avg for even the first date. I want to enforce my query to show ‘NULL’ unless complete frame of 7 days is available.

Following is table, my query, output, and expected output.

Customer Table:

CREATE TABLE Customer (
    customer_id INT NOT NULL,
    visited_on DATE NOT NULL,
    amount DECIMAL(10, 2) NOT NULL
);

INSERT INTO Customer (customer_id, visited_on, amount) VALUES
(1, '2019-01-01', 100.00),
(2, '2019-01-02', 110.00),
(3, '2019-01-03', 120.00),
(4, '2019-01-04', 130.00),
(5, '2019-01-05', 110.00),
(6, '2019-01-06', 140.00),
(7, '2019-01-07', 150.00),
(8, '2019-01-08', 80.00),
(9, '2019-01-09', 110.00),
(1, '2019-01-10', 130.00),
(3, '2019-01-10', 150.00);

MySQL query:

WITH DAILY_REVENUE AS
    (SELECT
         visited_on,
         SUM(amount) AS amount
    FROM
        Customer
    GROUP BY
        visited_on
    ORDER BY
        visited_on ASC
)  -- SELECT * FROM DAILY_REVENUE

, MOVING_AVG AS(
    SELECT
        visited_on,
        SUM(amount) OVER(ORDER BY visited_on ASC
                        ROWS BETWEEN 6 PRECEDING
                        AND
                        CURRENT ROW) AS amount,
        CAST(AVG(amount) OVER(ORDER BY visited_on ASC
                        ROWS BETWEEN 6 PRECEDING
                        AND
                        CURRENT ROW) AS DECIMAL(5,2)) AS average_amount
    FROM
        DAILY_REVENUE
        
)   SELECT * FROM MOVING_AVG

Output:

visited_on amount average_amount
2019-01-01 100 100
2019-01-02 210 105
2019-01-03 330 110
2019-01-04 460 115
2019-01-05 570 114
2019-01-06 710 118.33
2019-01-07 860 122.86
2019-01-08 840 120
2019-01-09 840 120
2019-01-10 1000 142.86

Expected Output:

visited_on amount average_amount
2019-01-01 NULL NULL
2019-01-02 NULL NULL
2019-01-03 NULL NULL
2019-01-04 NULL NULL
2019-01-05 NULL NULL
2019-01-06 NULL NULL
2019-01-07 860 122.86
2019-01-08 840 120
2019-01-09 840 120
2019-01-10 1000 142.86

2

Answers


  1. You can just do a windowed running count and check that it’s >= 7.

    You can also simplify this by using the named WINDOW clause and shortening it to just ROWS 6 PRECEDING.

    WITH DAILY_REVENUE AS (
        SELECT
             visited_on,
             SUM(amount) AS amount
        FROM
            Customer
        GROUP BY
            visited_on
    )  -- SELECT * FROM DAILY_REVENUE
    
    , MOVING_AVG AS (
        SELECT
            visited_on,
            CASE WHEN COUNT(*) OVER w >= 7
                 THEN SUM(amount) OVER w
              END AS amount,
            CASE WHEN COUNT(*) OVER w >= 7
                 THEN CAST(AVG(amount) OVER w AS DECIMAL(5,2))
              END AS average_amount
        FROM
            DAILY_REVENUE
        WINDOW w AS (ORDER BY visited_on ROWS 6 PRECEDING)
    )
    SELECT *
    FROM MOVING_AVG
    ORDER BY
        visited_on ASC;
    

    db<>fiddle

    Login or Signup to reply.
  2. You can also use a self-join so that for each row(the current day), we only join it with dates that are within the last 7 days.

    Fiddle

    WITH daily_totals AS (
        SELECT 
            visited_on,
            SUM(amount) AS total_amount
        FROM Customer
        GROUP BY visited_on
    )
    
    SELECT
        a.visited_on,
        CASE 
            WHEN COUNT(b.visited_on) = 7 THEN SUM(b.total_amount)
            ELSE NULL
        END AS amount,
        CASE 
            WHEN COUNT(b.visited_on) = 7 THEN ROUND(AVG(b.total_amount), 2)
            ELSE NULL
        END AS average_amount
    FROM daily_totals a
    LEFT JOIN daily_totals b
        ON b.visited_on BETWEEN a.visited_on - INTERVAL 6 DAY AND a.visited_on
    GROUP BY a.visited_on
    ORDER BY a.visited_on;
    

    Output

    enter image description here

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search