Return the number of records where condition met within time intervals in MySQL

M225rtonMoln225r
May 1, 2023
232 views
0 votes
2 Answers

I am at the beginning of learning SQL, and this problem gives me a headache. I have a table where I want to work with these columns:

updated_date	upload_date
2023-04-03 09:10:50	2023-04-03 09:08:24
2023-04-03 09:14:55	2023-04-03 09:08:34
2023-04-03 09:23:51	2023-04-03 09:08:36
2023-04-03 09:25:53	2023-04-03 09:14:35
2023-04-03 09:28:43	2023-04-03 09:17:29
2023-04-03 10:29:06	2023-04-03 09:39:16
2023-04-03 10:53:15	2023-04-03 09:44:16
2023-04-03 10:24:40	2023-04-03 09:48:39

What I would need is the COUNT of records within 1 hour timeframe (let’s say from 2023-04-03 09:00:00 to 2023-04-03 10:00:00) from the "upload_date" column, but only those where the processing time (updated_date – upload_date) takes more than 1 hour, plus return the average processing time (in hh:mm:ss format) in that group. This should be one row, the next row should contain the COUNTs between 10:00-11:00 and so on.

I tried this query:

SELECT COUNT(*) AS number_of_feeds,
       AVG(TIMEDIFF(updated_date, upload_date)) AS average_processing_time
FROM   auk_feed_queue  
WHERE  TIMEDIFF(updated_date, upload_date) > '00:59:59'

But it returns all records from the table in one row, without the correct time format:

number_of_feeds	average_processing_time
1264	90274.3861

I would need multiple rows, each row contains 1 hour timeframe. How do I achieve that?

Answers

You can find all hourly intervals in subquery, and then join the original table back onto it:

select t.d1 hour_interval, 
     sec_to_time(avg(time_to_sec(timediff(u.updated_date, u.upload_date)))) avg_duration, 
     count(*) cnt 
from (select distinct date_format(u.upload_date, '%Y-%m-%d %H:00:00.000') d1 from uploads u) t 
join uploads u on t.d1 <= u.upload_date and u.upload_date <= t.d1 + interval 1 day
where time_to_sec(timediff(u.updated_date, u.upload_date)) / 3600 > 1
group by t.d1

See fiddle

Your sample data only has one result that has a duration greater than 1 hour. I added a second one.

CREATE TABLE data (updated_date DATETIME, upload_date DATETIME);
INSERT INTO data VALUES
('2023-04-03 09:10:50', '2023-04-03 09:08:24'),
('2023-04-03 09:14:55', '2023-04-03 09:08:34'),
('2023-04-03 09:23:51', '2023-04-03 09:08:36'),
('2023-04-03 09:25:53', '2023-04-03 09:14:35'),
('2023-04-03 09:28:43', '2023-04-03 09:17:29'),
('2023-04-03 10:29:06', '2023-04-03 09:39:16'),
('2023-04-03 10:53:15', '2023-04-03 09:44:16'),
('2023-04-03 11:53:15', '2023-04-03 10:27:16'), /*Added this row to create second result*/
('2023-04-03 10:24:40', '2023-04-03 09:48:39');

The query first gets the beginning of the hour of the upload date so we can group the results. Then count, get average, and set your 1 hour condition in the where.

SELECT DATE_FORMAT(upload_date, '%Y-%m-%d %H:00:00') upload_date_hr
     , COUNT(*) countOfDates
     , CAST(AVG(TIMEDIFF(updated_date, upload_date)) AS TIME) avgDuration
  FROM data d
 WHERE TIMEDIFF(updated_date, upload_date) >= '01:00:00'
 GROUP BY upload_date_hr

This returns:

upload_date_hr	countOfDates	avgDuration
2023-04-03 09:00:00	1	01:08:59
2023-04-03 10:00:00	1	01:25:59

FIDDLE

Please signup or login to give your own answer.

Click here to cancel reply.