
Sample contents are:

id  created_dt           data
1   2023-01-14 11:52:41  {"customers": 1, "payments": 2}
2   2023-01-15 11:53:43  {"customers": 1, "payments": 2}
3   2023-01-18 11:51:45  {"customers": 1, "payments": 2}
4   2023-01-15 11:50:48  {"customers": 1, "payments": 2}

IDs 2 and 4 fall on the same day (2023-01-15), so only one of them should be counted.

I want to get a result as follows:

year week customers payments
2023 2 2 4
2023 3 1 2

I solved the problem this way:

SELECT
    date_part('year', sq.created_dt) AS year,
    date_part('week', sq.created_dt) AS week,
    sum((sq.data->'customers')::int) AS customers,
    sum((sq.data->'payments')::int) AS payments
FROM 
    (SELECT DISTINCT ON (created_dt::date) created_dt, data 
     FROM analytics) sq
GROUP BY 
    year, week
ORDER BY 
    year, week;

However, the subquery complicates the query considerably. Is there a better method?

I need to group the data by week, but I also need to remove duplicate days.
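
To reproduce the sample above, a minimal setup might look like the following. The column types are assumptions, since the sample does not state them: created_dt is taken to be a timestamp and data to be jsonb.

```sql
-- Assumed schema: the question does not state the column types.
CREATE TABLE analytics (
    id         int PRIMARY KEY,
    created_dt timestamp NOT NULL,
    data       jsonb NOT NULL
);

-- The four sample rows; ids 2 and 4 share the day 2023-01-15.
INSERT INTO analytics (id, created_dt, data) VALUES
    (1, '2023-01-14 11:52:41', '{"customers": 1, "payments": 2}'),
    (2, '2023-01-15 11:53:43', '{"customers": 1, "payments": 2}'),
    (3, '2023-01-18 11:51:45', '{"customers": 1, "payments": 2}'),
    (4, '2023-01-15 11:50:48', '{"customers": 1, "payments": 2}');
```

With a jsonb column, the cast (data->'customers')::int works on PostgreSQL 11 or later. If the column were json instead, the text form (data->>'customers')::int would be needed.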

3 Answers


  1. Using generate_series to build a set of week start dates and joining against it would solve the problem:

    SELECT sum((a.data->'customers')::int) AS customers,
           sum((a.data->'payments')::int) AS payments,
           date_part('year', dategroup) AS year,
           date_part('week', dategroup) AS week
    -- One row per week start; adjust the range so that it covers the data in analytics.
    FROM generate_series(current_date, current_date + interval '1 month', interval '1 week') AS dategroup
    -- Match each row to the week it falls in: [dategroup, dategroup + 1 week).
    JOIN analytics AS a
      ON a.created_dt >= dategroup
     AND a.created_dt <  dategroup + interval '1 week'
    GROUP BY dategroup
    ORDER BY dategroup;
    
  2. First of all, I think your query is quite simple and understandable.

    Here is the same query written with a WITH clause (CTE), which is arguably a bit more readable:

    -- Uses the column name "data" from the question's table.
    WITH unique_days_data AS (
      SELECT DISTINCT created_dt::date, data
      FROM analytics)
    SELECT 
        date_part('year', ud.created_dt) as year,
        date_part('week', ud.created_dt) as week,
        sum((ud.data->'customers')::int) as customers,
        sum((ud.data->'payments')::int) as payments
    FROM unique_days_data ud
    GROUP BY year, week
    ORDER BY year, week;
    

    The difference is that this query uses a plain DISTINCT rather than DISTINCT ON, so rows from the same day are collapsed only when their data values are identical as well.

    Here is the sql fiddle.
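
    To illustrate the difference between DISTINCT and DISTINCT ON, here is a small sketch. The inline rows in the CTE named demo are hypothetical (they are not the asker's data): two rows fall on the same day but carry different data. Plain DISTINCT keeps both, while DISTINCT ON keeps one row per day.

    ```sql
    -- Hypothetical inline data: same day, different JSON payloads.
    WITH demo(created_dt, data) AS (
        VALUES
            ('2023-01-15 11:53:43'::timestamp, '{"customers": 1, "payments": 2}'::jsonb),
            ('2023-01-15 11:50:48'::timestamp, '{"customers": 5, "payments": 7}'::jsonb)
    )
    SELECT
        -- DISTINCT compares the whole row (day and data), so both rows survive.
        (SELECT count(*)
         FROM (SELECT DISTINCT created_dt::date, data FROM demo) d) AS distinct_rows,
        -- DISTINCT ON compares only the day, so one row survives.
        (SELECT count(*)
         FROM (SELECT DISTINCT ON (created_dt::date) created_dt, data
               FROM demo
               ORDER BY created_dt::date, created_dt) d) AS distinct_on_rows;
    -- distinct_rows = 2, distinct_on_rows = 1
    ```

    If same-day rows can differ in their data, the two approaches give different weekly sums, so the choice depends on which behaviour is wanted.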

  3. You can simplify it by also grouping on created_dt::date, then keeping only the last aggregated row of each week using FETCH FIRST n ROWS WITH TIES (available since PostgreSQL 13).

    SELECT date_part('year', created_dt) AS year,
           date_part('week', created_dt) AS week,
           SUM((data->>'customers')::int) AS customers,
           SUM((data->>'payments')::int) AS payments
    FROM analytics
    GROUP BY year, week, created_dt::date
    ORDER BY ROW_NUMBER() OVER(
                 PARTITION BY date_part('week', created_dt) 
                 ORDER     BY created_dt::date DESC
             )
    FETCH FIRST 1 ROWS WITH TIES
    

    Check the demo here.
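
One detail that the queries above leave open: date_part('week', ...) returns the ISO 8601 week number, while date_part('year', ...) returns the calendar year, so the two can disagree around New Year (2023-01-01 belongs to ISO week 52 of 2022). The following is only a sketch of the asker's query that pairs the week with 'isoyear' and adds an ORDER BY so that DISTINCT ON keeps a well-defined row per day. The inline rows are hypothetical and chosen to straddle the year boundary; keeping the earliest row of each day is an arbitrary choice.

```sql
-- Hypothetical rows around New Year; the CTE stands in for the real table.
WITH analytics(created_dt, data) AS (
    VALUES
        ('2022-12-31 10:00:00'::timestamp, '{"customers": 1, "payments": 2}'::jsonb),
        ('2023-01-01 10:00:00'::timestamp, '{"customers": 1, "payments": 2}'::jsonb),
        ('2023-01-01 12:00:00'::timestamp, '{"customers": 1, "payments": 2}'::jsonb)
)
SELECT
    date_part('isoyear', sq.created_dt) AS year,   -- ISO year matches the ISO week
    date_part('week', sq.created_dt)    AS week,
    sum((sq.data->>'customers')::int)   AS customers,
    sum((sq.data->>'payments')::int)    AS payments
FROM (
    -- One row per calendar day; the ORDER BY makes the kept row
    -- deterministic (the earliest timestamp of that day).
    SELECT DISTINCT ON (created_dt::date) created_dt, data
    FROM analytics
    ORDER BY created_dt::date, created_dt
) sq
GROUP BY year, week
ORDER BY year, week;
-- One row: year 2022, week 52, customers 2, payments 4
```

With date_part('year', ...) instead of 'isoyear', the same rows would split into two groups, 2022/52 and 2023/52.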
