skip to Main Content

I have a table with name_id, date and col1-col5 with rates.
I’m looking for a solution to compare the average of the rates of the latest day with the average of the rates of the day before the latest day.
When the average of the latest day is higher then the day before the latest day, then ‘up’. When lower, then ‘down’.

table

The latest day:

SELECT AVG(CASE
         WHEN COALESCE(`col1`, `col2`, `col3`, `col4`, `col5`) IS NOT NULL THEN 
         (COALESCE(`col1`, 0) + COALESCE(`col2`, 0) + COALESCE(`col3`, 0) + COALESCE(`col4`, 0) + COALESCE(`col5`, 0) ) / 
            (CASE WHEN `col1` IS NULL THEN 0 ELSE 1 END + 
             CASE WHEN `col2` IS NULL THEN 0 ELSE 1 END +
             CASE WHEN `col3` IS NULL THEN 0 ELSE 1 END +
             CASE WHEN `col4` IS NULL THEN 0 ELSE 1 END +
             CASE WHEN `col5` IS NULL THEN 0 ELSE 1 END)
       END) AS Average
       FROM table WHERE `date`= (SELECT MAX(`date`) FROM table) AND name_id=3;

Average: 4.9028

The day before the latest day:

SELECT AVG(CASE
         WHEN COALESCE(`col1`, `col2`, `col3`, `col4`, `col5`) IS NOT NULL THEN 
         (COALESCE(`col1`, 0) + COALESCE(`col2`, 0) + COALESCE(`col3`, 0) + COALESCE(`col4`, 0) + COALESCE(`col5`, 0) ) / 
            (CASE WHEN `col1` IS NULL THEN 0 ELSE 1 END + 
             CASE WHEN `col2` IS NULL THEN 0 ELSE 1 END +
             CASE WHEN `col3` IS NULL THEN 0 ELSE 1 END +
             CASE WHEN `col4` IS NULL THEN 0 ELSE 1 END +
             CASE WHEN `col5` IS NULL THEN 0 ELSE 1 END)
       END) AS Average
       FROM table WHERE `date` = (SELECT MAX(date) FROM table WHERE date < (SELECT MAX(date) FROM table)) AND name_id=3;

Average: 5.3200

This query only compares the rates of the latest day with a ‘fixxed’ rate of 5.00 (not what I need):

SELECT CASE
WHEN AVG(CASE
         WHEN COALESCE(`col1`, `col2`, `col3`, `col4`, `col5`) IS NOT NULL THEN 
         (COALESCE(`col1`, 0) + COALESCE(`col2`, 0) + COALESCE(`col3`, 0) + COALESCE(`col4`, 0)+ COALESCE(`col5`, 0) ) / 
            (CASE WHEN `col1` IS NULL THEN 0 ELSE 1 END + 
             CASE WHEN `col2` IS NULL THEN 0 ELSE 1 END +
             CASE WHEN `col3` IS NULL THEN 0 ELSE 1 END +
             CASE WHEN `col4` IS NULL THEN 0 ELSE 1 END +
             CASE WHEN `col5` IS NULL THEN 0 ELSE 1 END)
       END) < 5.00 THEN 'up' ELSE 'down' END AS 'Rate'
       FROM table WHERE `date`= (SELECT MAX(`date`) FROM table) AND name_id=3;

Rate: up

How to make a query to compare the 2 outputs?

2

Answers


  1. 1 possible way is to calculate the average first, Join and compare the result then –

    WITH latest_day_avg AS (
    SELECT name_id,
        AVG(CASE
                WHEN COALESCE(`col1`, `col2`, `col3`, `col4`, `col5`) IS NOT NULL 
                    THEN (COALESCE(`col1`, 0) + COALESCE(`col2`, 0) + COALESCE(`col3`, 0) + COALESCE(`col4`, 0) + COALESCE(`col5`, 0) ) / 
                         (CASE WHEN `col1` IS NULL THEN 0 ELSE 1 END + 
                          CASE WHEN `col2` IS NULL THEN 0 ELSE 1 END +
                          CASE WHEN `col3` IS NULL THEN 0 ELSE 1 END +
                          CASE WHEN `col4` IS NULL THEN 0 ELSE 1 END +
                          CASE WHEN `col5` IS NULL THEN 0 ELSE 1 END)
                      END) AS Average
    FROM 
        table 
    WHERE 
        `date` = (SELECT MAX(`date`) FROM table)
    GROUP BY name_id
    ),
    prev_day_avg AS(
    SELECT name_id,
        AVG(CASE
             WHEN COALESCE(`col1`, `col2`, `col3`, `col4`, `col5`) IS NOT NULL THEN 
             (COALESCE(`col1`, 0) + COALESCE(`col2`, 0) + COALESCE(`col3`, 0) + COALESCE(`col4`, 0) + COALESCE(`col5`, 0) ) / 
                (CASE WHEN `col1` IS NULL THEN 0 ELSE 1 END + 
                 CASE WHEN `col2` IS NULL THEN 0 ELSE 1 END +
                 CASE WHEN `col3` IS NULL THEN 0 ELSE 1 END +
                 CASE WHEN `col4` IS NULL THEN 0 ELSE 1 END +
                 CASE WHEN `col5` IS NULL THEN 0 ELSE 1 END)
           END) AS Average
    FROM 
        table 
    WHERE 
        `date` = (SELECT MAX(date) FROM table 
                  WHERE date < (SELECT MAX(date) FROM table)) 
    GROUP BY name_id
    )
    SELECT CASE WHEN lda.average < pda.average THEN 'Down'
                WHEN lda.average > pda.average THEN 'Up'
                ELSE 'Equal'
           END
      FROM latest_day_avg lda
      JOIN prev_day_avg pda ON lda.name_id = pda.name_id
     WHERE name_id = 3;
    
    Login or Signup to reply.
  2. My understanding is that, where there are multiple rows for the same name_id/date pair, the columns col1 – col5 across the multiple rows form a single set. If my understanding is incorrect, you can completely ignore my answer 😉

    This is a contrived example to highlight the importance of understanding the difference between the average of all values, and the average of some intermediate averages (where the number of values in the intermediates varies).

    Let’s take this modified version of your sample data:

    id name_id date col1 col2 col3 col4 col5
    1 3 2023-07-28 4.64 4.70 4.80 4.90 5.10
    2 3 2023-07-28 4.75 4.86 4.87 5.05 5.15
    3 3 2023-07-28 6.30 NULL NULL NULL NULL
    4 3 2023-07-29 5.00 5.07 5.15 5.29 5.52
    5 3 2023-07-29 5.09 NULL 5.22 5.37 5.52
    6 3 2023-07-29 5.30 5.30 5.45 5.50 5.70

    And here is the same data with the sum, value count and average per row added:

    id name_id date col1 col2 col3 col4 col5 sum val_count avg
    1 3 2023-07-28 4.64 4.70 4.80 4.90 5.10 24.14 5 4.828000
    2 3 2023-07-28 4.75 4.86 4.87 5.05 5.15 24.68 5 4.936000
    3 3 2023-07-28 6.30 NULL NULL NULL NULL 6.30 1 6.300000
    4 3 2023-07-29 5.00 5.07 5.15 5.29 5.52 26.03 5 5.206000
    5 3 2023-07-29 5.09 NULL 5.22 5.37 5.52 21.20 4 5.300000
    6 3 2023-07-29 5.30 5.30 5.45 5.50 5.70 27.25 5 5.450000

    Now let’s look at the average of the averages vs the average of all values for each name_id/date grouping:

    name_id date avg_avg avg
    3 2023-07-28 5.3546666667 5.010909
    3 2023-07-29 5.3186666667 5.320000

    The avg_avg value has decreased between 2023-07-28 and 2023-07-28,
    but the avg value has increased for the same period.

    Modifying Ankit’s answer to compare the averages, instead of the averages of intermediate averages, we end up with:

    WITH latest_day_avg AS (
        SELECT name_id,
            SUM(IFNULL(col1, 0) + IFNULL(col2, 0) + IFNULL(col3, 0) + IFNULL(col4, 0) + IFNULL(col5, 0))
            /
            SUM((`col1` IS NOT NULL) + (`col2` IS NOT NULL) + (`col3` IS NOT NULL) + (`col4` IS NOT NULL) + (`col5` IS NOT NULL)) AS Average
        FROM tbl 
        WHERE `date` = (SELECT MAX(`date`) FROM tbl)
        GROUP BY name_id
    ),
    prev_day_avg AS(
        SELECT name_id,
            SUM(IFNULL(col1, 0) + IFNULL(col2, 0) + IFNULL(col3, 0) + IFNULL(col4, 0) + IFNULL(col5, 0))
            /
            SUM((`col1` IS NOT NULL) + (`col2` IS NOT NULL) + (`col3` IS NOT NULL) + (`col4` IS NOT NULL) + (`col5` IS NOT NULL)) AS Average
        FROM tbl 
        WHERE `date` = (SELECT MAX(date) FROM tbl WHERE date < (SELECT MAX(date) FROM tbl))
        GROUP BY name_id
    )
    SELECT CASE WHEN lda.average < pda.average THEN 'Down'
                WHEN lda.average > pda.average THEN 'Up'
                ELSE 'Equal'
           END AS Rate
      FROM latest_day_avg lda
      JOIN prev_day_avg pda ON lda.name_id = pda.name_id
     WHERE lda.name_id = 3;
    

    Here’s a db<fiddle.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search