skip to Main Content

I have a query that groups data by week. I would like to include a partial week at the start and end of the result but the partial week does not count the records.

SELECT
    series_interval,
    coalesce(conversion_count, 0) AS conversion_count
FROM (
      SELECT
        series::date AS series_interval
      FROM
        generate_series(
          date_trunc('week', to_date('2024-12-10', 'YYYY-MM-DD')), -- Added 6 days to start date to get following week
          date_trunc('week', to_date('2024-12-23', 'YYYY-MM-DD')),
          INTERVAL '1 week'
        ) AS series
      UNION
      SELECT to_date('2024-12-04', 'YYYY-MM-DD')
    ) AS series -- Start date (middle of week)
    LEFT JOIN (
    SELECT
        date_trunc('week', "conversions"."created_at") AS agg_interval,
        count("conversions"."id") AS conversion_count
    FROM
        "conversions"
        INNER JOIN "commissions" ON "commissions"."conversion_id" = "conversions"."id"
    WHERE
        "conversions"."created_at" >= '2024-12-02' 
             AND "conversions"."created_at" <= '2024-12-23 23:59:59.999999'
    GROUP BY
        agg_interval
) c ON c."agg_interval" = series."series_interval";

The date buckets are:

4th => 8th (5 days)
9th => 15th (7 days)
16th => 22nd (7 days)
23rd => 23rd (1 day – restricted by WHERE so final partial week is what I want)

And the query result is:

series_interval conversion_count
2024-12-04 0
2024-12-09 411
2024-12-16 368
2024-12-23 32

I understand why I get 0 for the first row 0 – because 2024-12-04 is not a Monday and this is how the date truncation / grouping works.

How can I get the count for 4th => 8th?

2

Answers


  1. For simplicity, I’l use CTE params to set start_date and end_date for query. Further, the literal constants are not used in the query.
    Source data for example see in fiddle
    .

    Generate series (report frame) include start_date, start date of weeks between start_date and end_date.

    with params as(
      select '2024-12-04 00:00:00.00000'::timestamp start_date
         ,'2024-12-23 23:59:59.999999'::timestamp end_date
    )
    SELECT start_date report_date ,date_trunc('week',start_date) series_interval
    from params p
    
    UNION
    SELECT series::date AS report_date,series::date AS series_interval
    FROM params p
    cross join generate_series(
              date_trunc('week',start_date+interval '6 day'), -- Added 6 days to start date to get following week
              date_trunc('week', end_date),
              INTERVAL '1 week'
            ) AS series
    
    report_date series_interval
    2024-12-04 00:00:00 2024-12-02 00:00:00
    2024-12-16 00:00:00 2024-12-16 00:00:00
    2024-12-09 00:00:00 2024-12-09 00:00:00
    2024-12-23 00:00:00 2024-12-23 00:00:00

    Column report_date is used as output, series_interval – for join with grouped data.
    Extra dates are cut off by the filter WHERE created_at between start_date and end_date.

    with params as(
      select '2024-12-04 00:00:00.00000'::timestamp start_date
         ,'2024-12-23 23:59:59.999999'::timestamp end_date
    )
    SELECT report_date::date, series_interval,
        coalesce(conversion_count, 0) AS conversion_count
    FROM (
        SELECT start_date report_date ,date_trunc('week',start_date) series_interval
        from params p
        UNION
        SELECT series::date AS report_date,series::date AS series_interval
        FROM params p
        cross join generate_series(
              date_trunc('week',start_date+interval '6 day'), -- Added 6 days to start date to get following week
              date_trunc('week', end_date),
              INTERVAL '1 week'
            ) AS series
      ) AS series -- Start date (middle of week)
        LEFT JOIN (
          SELECT
            date_trunc('week', "conversions"."created_at") AS agg_interval,
            count("conversions"."id") AS conversion_count
          FROM "conversions"
          cross join params p
    --        INNER JOIN "commissions" ON "commissions"."conversion_id" = "conversions"."id"
          WHERE "conversions"."created_at" between start_date and end_date
          GROUP BY agg_interval
     ) c ON c."agg_interval" = series."series_interval"
     order by report_date;
    
    report_date series_interval conversion_count
    2024-12-04 2024-12-02 00:00:00 5
    2024-12-09 2024-12-09 00:00:00 7
    2024-12-16 2024-12-16 00:00:00 7
    2024-12-23 2024-12-23 00:00:00 1

    fiddle

    If you want first data bucket would aggregated on the full week (starting on 2nd Dec) – (It’s not clear why) use where filter

    WHERE "conversions"."created_at" between date_trunc('week',start_date) and end_date
    
    Login or Signup to reply.
  2. You have two non-standard week definitions: 1 starting other the Monday and 1 not being a full week. This complicates the matter somewhat but not excessively. The following uses a recursive CTE to generate the base periods beginning with your non-standard starting week and following periods each ending on the following each Monday and beginning the next period (it does however not deal with the short week). Once the periods are defined a second CTE handles the short week while converting each to a daterange. The main query then just counts the rows falling into each date range. See demo here.

    with recursive 
         intvl(sdate,edate,mdate) as
              ( select '2024-12-04'::date                   -- initial date 
                     , next_day('2024-12-04 '::date, 'Mon') -- to Monday following
                     , '2024-12-24'::date                   -- maximun date to process
                union all 
                select edate                                -- beginning of period
                     , next_day(edate,'Mon')                -- to next Monday
                     , mdate                                -- keeping Max date
                  from intvl
                 where sdate < '2024-12-23'::date 
              ) -- select * from intvl;
        , wk_range(agg_interval) as
             (select daterange(sdate, least(edate,mdate), '[)')
                from intvl
             ) -- select * from wk_range;
    select lower(wk.agg_interval)  "Series Interval" 
         , count(j.created_dt)     "Conversion Count" 
      from wk_range wk  
      left join ( select created_at::date created_dt
                       , c.id id, cm.conversion_id jid
                    from conversions  c
                    join commissions  cm
                      on cm.conversion_id = c.id
                      order by 1
                ) j
             on j.created_dt <@ wk.agg_interval
          group by lower(wk.agg_interval)
          order by lower(wk.agg_interval);
    

    NOTE: The above makes use of a next_day() function. Oracle provides a function to do this. I found it so useful that I wrote a Postgres version to do the same. It is provided in the demo.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search