skip to Main Content

I have T1 which is formatted as

STUDENT SCORE   DATE
1   6   2022-02-01
1   0   2022-03-12
1   5   2022-04-30
1   1   2022-04-30
1   1   2022-05-14
1   1   2022-05-19
1   8   2022-05-26
2   9   2022-01-02
2   10  2022-04-11
2   2   2022-04-12
2   0   2022-04-17
2   7   2022-05-08
2   4   2022-05-12
3   10  2022-01-09
3   2   2022-02-11
3   6   2022-03-16
3   3   2022-03-18
3   2   2022-04-02
3   9   2022-04-27
4   4   2022-02-24
4   0   2022-02-26
4   9   2022-02-28
4   2   2022-03-27
4   8   2022-04-02
4   4   2022-04-14
5   3   2022-01-28
5   5   2022-02-12
5   6   2022-02-18
5   0   2022-02-21
5   4   2022-04-05
XX  0.711094564 2022-02-28
XX  0.60584994  2022-03-31
XX  0.087965016 2022-04-30
YY  0.497937992 2022-02-28
YY  0.727796963 2022-03-31
YY  0.974471085 2022-04-30
YY  0.780187398 2022-05-30

First I wish to extract the UNIQUE values of DATE when STUDENT equals to XX and YY. There are some duplicates. These are the trimester ending dates.

Then I wish to SUM up the SCORE values for each STUDENT across the date ranges from Step 1:

enter image description here

And together I wish to generate this table T2

STUDENT SCORE   DATE.START  DATE.END
1   6   2022-01-01  2022-02-28
1   0   2022-03-01  2022-03-31
1   6   2022-04-01  2022-04-30
1   10  2022-05-01  2022-05-30
2   9   2022-01-01  2022-02-28
2   NA  2022-03-01  2022-03-31
2   12  2022-04-01  2022-04-30
2   11  2022-05-01  2022-05-30
3   12  2022-01-01  2022-02-28
3   9   2022-03-01  2022-03-31
3   11  2022-04-01  2022-04-30
3   NA  2022-05-01  2022-05-30
4   13  2022-01-01  2022-02-28
4   2   2022-03-01  2022-03-31
4   12  2022-04-01  2022-04-30
4   NA  2022-05-01  2022-05-30
5   14  2022-01-01  2022-02-28
5   NA  2022-03-01  2022-03-31
5   4   2022-04-01  2022-04-30
5   NA  2022-05-01  2022-05-30

And I try this,

WITH T2 AS(
SELECT DISTINCT(DATE),
COALESCE(LEAD(DATE) OVER(ORDER BY DATE) AS DATE.START,
ROW_NUMBER() OVER (ORDER BY DATE) AS DATE.END
FROM T1 WHERE (STUDENT = 'XX' OR STUDENT = 'YY')
)
SELECT STUDENT, SUM(SCORE), DATE.START, DATE.END
FROM T1 JOIN T2 ON T1.DATE.START BETWEEN DATE.START AND DATE.END

4

Answers


  1. Your date ranges are just month boundaries, so you can use ordinary GROUP BY

    SELECT student, DATE_FORMAT(date, '%Y-%m') AS year_month, SUM(score) AS score
    FROM T1
    WHERE student IN ('XX', 'YY')
    GROUP BY student, year_month
    
    Login or Signup to reply.
  2. In this one we’re looking at score by month so we don’t really need the bottom half of the information. In fact, it makes it better not to have it cause then we won’t be able to save student and score as int.

    select    student
             ,sum(score)                              as total_score
             ,max(dateadd(day, 1, eomonth(date, -1))) as "DATE.START"
             ,max(eomonth(date))                      as "DATE.END"
    from      t
    group  by student, year(date), month(date)
    
    student total_score DATE.START DATE.END
    1 6 2022-02-01 2022-02-28
    1 0 2022-03-01 2022-03-31
    1 6 2022-04-01 2022-04-30
    1 10 2022-05-01 2022-05-31
    2 9 2022-01-01 2022-01-31
    2 12 2022-04-01 2022-04-30
    2 11 2022-05-01 2022-05-31
    3 10 2022-01-01 2022-01-31
    3 2 2022-02-01 2022-02-28
    3 9 2022-03-01 2022-03-31
    3 11 2022-04-01 2022-04-30
    4 13 2022-02-01 2022-02-28
    4 2 2022-03-01 2022-03-31
    4 12 2022-04-01 2022-04-30
    5 3 2022-01-01 2022-01-31
    5 11 2022-02-01 2022-02-28
    5 4 2022-04-01 2022-04-30

    Fiddle

    Login or Signup to reply.
  3. Here we create a calendar divided to trimesters and join it with our table matching each date to the correct trimester in the calendar.

    DECLARE @Year INT = '2019';
    DECLARE @YearCnt INT = 4 ;
    DECLARE @StartDate DATE = DATEFROMPARTS(@Year, 01, 01)
    DECLARE @EndDate DATE = DATEADD(DAY, -1, DATEADD(YEAR, @YearCnt, @StartDate));
    
    ;WITH Cal(n) AS
    (
    SELECT 0 UNION ALL SELECT n + 4 FROM Cal
    WHERE n < DATEDIFF(month, @StartDate, @EndDate)
    ),
    FnlDt(d) AS
    (
    SELECT DATEADD(month, n, @StartDate) FROM Cal
    ),
    FinalCte AS
    (
    SELECT
    [Date] = CONVERT(DATE,d)
    
    FROM FnlDt
    )
    select     STUDENT
              ,sum(score) as total_score
              ,semester_start
              ,semester_end
              
    from       (
               select   date                                             as semester_start
                       ,lead(dateadd(day, -1, date)) over(order by date) as semester_end
               from     finalCte 
               ) t2 join t on t.date between semester_start and semester_end   
    group by   STUDENT, semester_start, semester_end
    
               
    
    STUDENT total_score semester_start semester_end
    1 12 2022-01-01 2022-04-30
    1 10 2022-05-01 2022-08-31
    2 21 2022-01-01 2022-04-30
    2 11 2022-05-01 2022-08-31
    3 32 2022-01-01 2022-04-30
    4 27 2022-01-01 2022-04-30
    5 18 2022-01-01 2022-04-30

    Fiddle

    Login or Signup to reply.
  4. This answer is EXACTLY what the question was asking for, including the hacks to accommodate for the ill-posed specifications. Here it goes…

    Select Distinct Date From t1 Where Student = 'XX' or Student = 'YY' Order By date
    
    Select t2.STUDENT as STUDENT,
       COALESCE(CAST(Sum(Case When (t1.DATE is null) Then null
                              When (t1.DATE >= t2.di and t1.DATE <= t2.df) Then t1.SCORE
                              Else 0
            End) AS DECIMAL(2,0)), 'NA') as SCORE, 
       di as "Date.START",
       df as "Date.END"
       From (Select STUDENT, di, df 
                    From (Select Distinct STUDENT From t1 Where STUDENT <> 'XX' and STUDENT <> 'YY') as ta,
                         (Select Distinct Case When MONTH(t1.DATE) = 2 Then DATE_SUB(DATE_SUB(t1.DATE,INTERVAL DAYOFMONTH(t1.DATE) - 1 DAY), INTERVAL 31 DAY)
                                               Else DATE_SUB(t1.DATE,INTERVAL DAYOFMONTH(t1.DATE) - 1 DAY)
                                          End as di,
                                           t1.DATE as df
                                      from t1 where STUDENT = 'XX' or STUDENT = 'YY') as tb
             ) as t2
       Left Join t1
             On t2.STUDENT = t1.STUDENT and t1.DATE >= t2.di and t1.DATE <= t2.df 
       Group By t2.STUDENT, t2.df 
       Order By t2.STUDENT, t2.df 
    

    In case anyone wants to play with the query, here goes a link to a Fiddle page.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search