skip to Main Content

my_events table stores event_id, event_date, event_type, event_duration.

There are many event types for example: Event A, Event B, Event C, Event D
The query that I have written will get my results but will exclude events that were not in my observation period event_date > ‘2023-05-01’

How do I get the results so that the grouping will show all Events in the first column?

This query will return only events that have occurred after ‘2023-05-01’
for example:

SELECT event_type,
COUNT(event_id) as num_events,
AVG(event_duration) as avg_duration
FROM my_events
WHERE event_date > '2023-05-01'
GROUP BY event_type
event_type num_events avg_duration
Event B 20 00:33:25
Event C 33 01:04:22

I want the results to show A and D even if they do not occur in my observation period:

event_type num_events avg_duration
Event A 0 00:00:00
Event B 20 00:33:25
Event C 33 01:04:22
Event D 0 00:00:00

Additional information based on responses:
The table is very large and I would like to stick to pre-defining a list of rows of events.
My next attempt was to create a Common Table Expression and left join my_events onto the CTE as shown

WITH event_list
     AS (SELECT 'Event A' AS event_type
         UNION ALL
         SELECT 'Event B' AS event_type
         UNION ALL
         SELECT 'Event C' AS event_type
         UNION ALL
         SELECT 'Event D' AS event_type)
SELECT el.event_type,
       Count(me.event_id)     AS num_events,
       Avg(me.event_duration) AS avg_duration
FROM   event_list el
       LEFT JOIN my_events me using (event_type)
WHERE  me.event_date > '2023-05-01'
GROUP  BY el.event_type

However, if I run the CTE alone it displays the rows for Events A-D, but after joining it with my_events I still get the same result of just B and C.

2

Answers


  1. You can select all distinct event_type‘s and then use subqueries to collect num_events and avg_duration:

    SELECT DISTINCT event_type, 
    (SELECT COUNT(*) FROM my_events WHERE event_type = events.event_type AND event_date > '2023-05-01') AS num_events, 
    (SELECT AVG(event_duration) FROM my_events WHERE event_type = events.event_type AND event_date > '2023-05-01') AS avg_duration 
    FROM my_events AS events;
    

    If you need to get 00:00:00 in avg_duration instead of NULL‘s, then use COALESCE function:

    SELECT DISTINCT event_type, 
    (SELECT COUNT(*) FROM my_events WHERE event_type = events.event_type AND event_date > '2023-05-01') AS num_event, 
    COALESCE((SELECT AVG(event_duration) FROM my_events WHERE event_type = events.event_type AND event_date > '2023-05-01'), '00:00:00') AS avg_duration 
    FROM my_events AS events;
    
    Login or Signup to reply.
  2. The following query returns the requested results:

    WITH event_types AS (
      SELECT DISTINCT event_type
        FROM my_events
    )
    SELECT event_types.event_type,
           COUNT(my_events.event_id) AS num_events,
           COALESCE(AVG(my_events.event_duration), INTERVAL '0' SECOND) AS avg_duration
      FROM event_types
      LEFT JOIN my_events
        ON event_types.event_type = my_events.event_type
          AND my_events.event_date > '2023-05-01'::DATE
     GROUP BY event_types.event_type;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search