Table 1 and sample data:-
CREATE TABLE student_p (
ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
S_id INT UNSIGNED NOT NULL,
Points DOUBLE NOT NULL,
P_date DATE NOT NULL
);
INSERT INTO student_p VALUES
(50055, 3330, 45, '2023-11-30'),
(50056, 332, 43, '2013-10-31'),
(50057, 3330, 22, '2013-10-30');
Table 2 and sample data: –
CREATE TABLE student_act (
ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
s_id INT UNSIGNED NOT NULL,
VIDEO_SCORE DOUBLE NOT NULL,
EXERCISESCORE DOUBLE NOT NULL,
DS INT NOT NULL,
A_date DATE NOT NULL
);
INSERT INTO student_act VALUES
(2333, 233, 22.43, 233.4455, 23, '2023-11-30'),
(2334, 235, 24.566, 232, 34, '2023-10-31'),
( 322, 678, 23, 45, 23, '2022-10-30'),
( 433, 45, 23, 23, 43, '2022-10-01');
Only ids in both table can’t be duplicate rest all data can be duplicate.
student_p table: –
ID | S_id | Points | P_date |
---|---|---|---|
50055 | 3330 | 45 | 2023-11-30 |
50056 | 332 | 43 | 2013-10-31 |
50057 | 3330 | 22 | 2013-10-30 |
student_act table: –
ID | s_id | VIDEO_SCORE | Lesson_score | DS | A_date |
---|---|---|---|---|---|
2333 | 233 | 22.43 | 233.4455 | 23 | 2023-11-30 |
2334 | 235 | 24.566 | 232 | 34 | 2023-10-31 |
322 | 678 | 23 | 45 | 23 | 2022-10-30 |
433 | 45 | 23 | 23 | 43 | 2022-10-01 |
desired result: –
year | month | points | VIDEO_SCORE | Lesson_score | DS |
---|---|---|---|---|---|
2023 | 11 | 45 | 22.43 | 233.4455 | 23 |
2023 | 10 | null | 24.566 | 232 | 34 |
2022 | 10 | null | 46 | 68 | 66 |
2013 | 10 | 65 | null | null | null |
4
Answers
For starters, you can try to index your DATE field on the student_act table.
Indexing helps the SQL engine to find records based on that field faster.
MySql:
Indexing optimises the lookup process in SQL engines, usually very effective if the query matches records based on MULTIPLE fields, or like in your example, the matched field is non-numeric.
Also, try to rename your field to something different than DATE, as it is a reserved word in some SQL engines.
In your second query you are restricting the result set by the year. If that is what you want to do, you should do it based on the date, not a function on the date, as that is non-sargable. This will then be able to use the index suggested by Hossam. Putting it (overly) simplistically, the
HAVING
clause is applied after doing all the work, whereas theWHERE
clause reduces the amount of work being done:MAKEDATE(year,dayofyear)
In one of your comments you stated that you wanted a
full join
, which MySQL does not have but you can achieve the same with a(LEFT|RIGHT) JOIN
UNION
ed to another(LEFT|RIGHT) JOIN
.Output:
The two CTEs do the aggregation. I left the
WHERE
clauses in but commented out.The first query in the
UNION
is your currentRIGHT JOIN
retrieving all records from the right hand table, whether or not they have an associated row in the left hand table. The second query then gets all the rows from the left hand table which do not have an associated row in the right hand side.Here’s a db<>fiddle with invalid dates and duplicate PK values tweaked.
For MySQL 5.7 you will have to repeat the subqueries within your main query, as CTEs are not available to you:
Output:
db<>fiddle
Is this what you are looking for?
The second query in the
UNION
could be replaced with:You’ll have to experiment with these query variations and your real data to see what works best. You should throw O. Jones’ suggestion into the mix too, to see if it improves the performance of the aggregate queries.
db<>fiddle
If you have performance issues with this query, please update your question including the EXPLAIN output for the full query, and for the two aggregation subqueries.
You can also try this
Instead of using the YEAR() and MONTH() functions, try grouping by LAST_DAY(). It gives you the last day of the month containing any DATE, DATETIME, or TIMESTAMP.
Like this:
Create a covering index on
student_p(date, POINT)
and another onstudent_act(DATE, VIDEO_SCORE, EXERCISES_SCORE, DS)
.That form of the query and those indexes will help a lot.