skip to Main Content

Table 1 and sample data:-

CREATE TABLE student_p (
    ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    S_id INT UNSIGNED NOT NULL,
    Points DOUBLE NOT NULL,
    P_date DATE NOT NULL
);
INSERT INTO student_p VALUES
    (50055, 3330, 45, '2023-11-30'),
    (50056,  332, 43, '2013-10-31'),
    (50057, 3330, 22, '2013-10-30');

Table 2 and sample data: –

CREATE TABLE student_act (
    ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    s_id INT UNSIGNED NOT NULL,
    VIDEO_SCORE DOUBLE NOT NULL,
    EXERCISESCORE DOUBLE NOT NULL,
    DS INT NOT NULL,
    A_date DATE NOT NULL
);
INSERT INTO student_act VALUES
    (2333, 233, 22.43,  233.4455, 23, '2023-11-30'),
    (2334, 235, 24.566,      232, 34, '2023-10-31'),
    ( 322, 678, 23,           45, 23, '2022-10-30'),
    ( 433,  45, 23,           23, 43, '2022-10-01');

Only ids in both table can’t be duplicate rest all data can be duplicate.

student_p table: –

ID S_id Points P_date
50055 3330 45 2023-11-30
50056 332 43 2013-10-31
50057 3330 22 2013-10-30

student_act table: –

ID s_id VIDEO_SCORE Lesson_score DS A_date
2333 233 22.43 233.4455 23 2023-11-30
2334 235 24.566 232 34 2023-10-31
322 678 23 45 23 2022-10-30
433 45 23 23 43 2022-10-01

desired result: –

year month points VIDEO_SCORE Lesson_score DS
2023 11 45 22.43 233.4455 23
2023 10 null 24.566 232 34
2022 10 null 46 68 66
2013 10 65 null null null

4

Answers


  1. For starters, you can try to index your DATE field on the student_act table.
    Indexing helps the SQL engine to find records based on that field faster.

    MySql:

    ALTER TABLE `student_act` ADD INDEX `date_index` (`DATE`)
    

    Indexing optimises the lookup process in SQL engines, usually very effective if the query matches records based on MULTIPLE fields, or like in your example, the matched field is non-numeric.

    Also, try to rename your field to something different than DATE, as it is a reserved word in some SQL engines.

    Login or Signup to reply.
  2. In your second query you are restricting the result set by the year. If that is what you want to do, you should do it based on the date, not a function on the date, as that is non-sargable. This will then be able to use the index suggested by Hossam. Putting it (overly) simplistically, the HAVING clause is applied after doing all the work, whereas the WHERE clause reduces the amount of work being done:

    SELECT
      YEAR(msa.DATE) AS Year,
      MONTH(msa.DATE) AS Month,
      SUM(msk.POINT) AS Ps,
      SUM(msa.VIDEO_SCORE) AS Video,
      SUM(msa.EXERCISESCORE) AS Lessons,
      SUM(msa.DS) AS DS
    FROM student_p msk
    RIGHT JOIN student_act msa ON msk.DATE = msa.DATE
    WHERE msa.DATE >= MAKEDATE(2023, 1) AND msa.DATE < MAKEDATE(2024, 1)
    GROUP BY `Year`, `Month`;
    

    MAKEDATE(year,dayofyear)

    Returns a date, given year and day-of-year values. dayofyear must be greater than 0 or the result is NULL.


    In one of your comments you stated that you wanted a full join, which MySQL does not have but you can achieve the same with a (LEFT|RIGHT) JOIN UNIONed to another (LEFT|RIGHT) JOIN.

    WITH st (`Year`, `Month`, `Ps`) AS (
        SELECT
            YEAR(`P_date`) AS `Year`,
            MONTH(`P_date`) AS `Month`,
            SUM(`Points`)
        FROM `student_p`
        -- WHERE `P_date` >= MAKEDATE(2023, 1) AND `P_date` < MAKEDATE(2024, 1)
        GROUP BY `Year`, `Month`
    ),
    act (`Year`, `Month`, `Video`, `Lessons`, `DS`) AS (
        SELECT
            YEAR(`A_date`) AS `Year`,
            MONTH(`A_date`) AS `Month`,
            SUM(`VIDEO_SCORE`) AS `Video`,
            SUM(`EXERCISESCORE`) AS `Lessons`,
            SUM(`DS`) AS `DS`
        FROM student_act
        -- WHERE A_date >= MAKEDATE(2023, 1) AND DATE < MAKEDATE(2024, 1)
        GROUP BY `Year`, `Month`
    )
    SELECT
        `Year`,
        `Month`,
        SUM(`Ps`) AS `Ps`,
        SUM(`Video`) AS `Video`,
        SUM(`Lessons`) AS `Lessons`,
        SUM(`DS`) AS `DS`
    FROM (
        SELECT `act`.`Year`, `act`.`Month`, `Ps`, `Video`, `Lessons`, `DS`
        FROM `st`
        RIGHT JOIN `act` ON `st`.`Year` = `act`.`Year` AND `st`.`Month` = `act`.`Month`
    
        UNION ALL
    
        SELECT `st`.`Year`, `st`.`Month`, `Ps`, `Video`, `Lessons`, `DS`
        FROM `st`
        LEFT JOIN `act` ON `st`.`Year` = `act`.`Year` AND `st`.`Month` = `act`.`Month`
        WHERE `act`.`Year` IS NULL
    ) t
    GROUP BY `Year`, `Month`
    ORDER BY `Year` DESC, `Month` DESC;
    

    Output:

    Year Month Ps Video Lessons DS
    2023 11 45 22.43 233.4455 23
    2023 10 NULL 24.566 232 34
    2022 10 NULL 46 68 66
    2013 10 65 NULL NULL NULL

    The two CTEs do the aggregation. I left the WHERE clauses in but commented out.

    The first query in the UNION is your current RIGHT JOIN retrieving all records from the right hand table, whether or not they have an associated row in the left hand table. The second query then gets all the rows from the left hand table which do not have an associated row in the right hand side.

    Here’s a db<>fiddle with invalid dates and duplicate PK values tweaked.


    For MySQL 5.7 you will have to repeat the subqueries within your main query, as CTEs are not available to you:

    SELECT
        `Year`,
        `Month`,
        SUM(`Ps`) AS `Ps`,
        SUM(`Video`) AS `Video`,
        SUM(`Lessons`) AS `Lessons`,
        SUM(`DS`) AS `DS`
    FROM (
        SELECT `act`.`Year`, `act`.`Month`, `Ps`, `Video`, `Lessons`, `DS`
        FROM (
            SELECT
                YEAR(`P_date`) AS `Year`,
                MONTH(`P_date`) AS `Month`,
                SUM(`Points`) AS `Ps`
            FROM `student_p`
            -- WHERE `P_date` >= MAKEDATE(2023, 1) AND `P_date` < MAKEDATE(2024, 1)
            GROUP BY `Year`, `Month`
        ) `st`
        RIGHT JOIN (
            SELECT
                YEAR(`A_date`) AS `Year`,
                MONTH(`A_date`) AS `Month`,
                SUM(`VIDEO_SCORE`) AS `Video`,
                SUM(`EXERCISESCORE`) AS `Lessons`,
                SUM(`DS`) AS `DS`
            FROM student_act
            -- WHERE A_date >= MAKEDATE(2023, 1) AND DATE < MAKEDATE(2024, 1)
            GROUP BY `Year`, `Month`
        ) `act` ON `st`.`Year` = `act`.`Year` AND `st`.`Month` = `act`.`Month`
    
        UNION ALL
    
        SELECT `st`.`Year`, `st`.`Month`, `Ps`, `Video`, `Lessons`, `DS`
        FROM (
            SELECT
                YEAR(`P_date`) AS `Year`,
                MONTH(`P_date`) AS `Month`,
                SUM(`Points`) AS `Ps`
            FROM `student_p`
            -- WHERE `P_date` >= MAKEDATE(2023, 1) AND `P_date` < MAKEDATE(2024, 1)
            GROUP BY `Year`, `Month`
        ) `st`
        LEFT JOIN (
            SELECT
                YEAR(`A_date`) AS `Year`,
                MONTH(`A_date`) AS `Month`,
                SUM(`VIDEO_SCORE`) AS `Video`,
                SUM(`EXERCISESCORE`) AS `Lessons`,
                SUM(`DS`) AS `DS`
            FROM student_act
            -- WHERE A_date >= MAKEDATE(2023, 1) AND DATE < MAKEDATE(2024, 1)
            GROUP BY `Year`, `Month`
        ) `act` ON `st`.`Year` = `act`.`Year` AND `st`.`Month` = `act`.`Month`
        WHERE `act`.`Year` IS NULL
    ) t
    GROUP BY `Year`, `Month`
    ORDER BY `Year` DESC, `Month` DESC;
    

    Output:

    Year Month Ps Video Lessons DS
    2023 11 45 22.43 233.4455 23
    2023 10 NULL 24.566 232 34
    2022 10 NULL 46 68 66
    2013 10 65 NULL NULL NULL

    db<>fiddle

    Is this what you are looking for?

    The second query in the UNION could be replaced with:

        SELECT
            YEAR(`P_date`) AS `Year`,
            MONTH(`P_date`) AS `Month`,
            SUM(`Points`) AS `Ps`,
            NULL AS `Video`,
            NULL AS `Lessons`,
            NULL AS `DS`
        FROM `student_p`
        WHERE NOT EXISTS (
            SELECT 1
            FROM student_act
            WHERE A_date BETWEEN student_p.P_date - INTERVAL (DAY(student_p.P_date) - 1) DAY
                             AND LAST_DAY(student_p.P_date)
        )
        -- AND `P_date` >= MAKEDATE(2023, 1) AND `P_date` < MAKEDATE(2024, 1)
        GROUP BY `Year`, `Month`
    

    You’ll have to experiment with these query variations and your real data to see what works best. You should throw O. Jones’ suggestion into the mix too, to see if it improves the performance of the aggregate queries.

    db<>fiddle

    If you have performance issues with this query, please update your question including the EXPLAIN output for the full query, and for the two aggregation subqueries.

    Login or Signup to reply.
  3. You can also try this

    SELECT
      YEAR(msa.DATE) AS Year,
      MONTH(msa.DATE) AS Month,
      SUM(msk.POINT) AS Ps,
      SUM(msa.VIDEO_SCORE) AS Video,
      SUM(msa.EXERCISESCORE) AS Lessons,
      SUM(msa.DS) AS DS
    FROM student_p msk
    RIGHT JOIN student_act msa ON msk.DATE = msa.DATE
    WHERE msa.DATE between '2023-01-01 00:00:00' AND '2023-12-31 23:59:59'
    GROUP BY MONTH(msa.DATE);
    
    Login or Signup to reply.
  4. Instead of using the YEAR() and MONTH() functions, try grouping by LAST_DAY(). It gives you the last day of the month containing any DATE, DATETIME, or TIMESTAMP.

    Like this:

    SELECT
      LAST_DAY(msa.DATE) AS MonthEnding,
      SUM(msk.POINT) AS Ps,
      SUM(msa.VIDEO_SCORE) AS Video,
      SUM(msa.EXERCISESCORE) AS Lessons,
      SUM(msa.DS) AS DS
    FROM student_p msk
    RIGHT JOIN student_act msa ON msk.DATE = msa.DATE
    WHERE msa.DATE >= MAKEDATE(2023, 1) AND msa.DATE < MAKEDATE(2024, 1)
    GROUP BY LAST_DAY(msa.DATE);
    

    Create a covering index on student_p(date, POINT) and another on student_act(DATE, VIDEO_SCORE, EXERCISES_SCORE, DS).

    That form of the query and those indexes will help a lot.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search