Exclude rows from MySQL table where timestamp is less than in other row

Michael
May 31, 2024
182 views
2 votes
3 Answers

I have a MySQL table having "datetime" columns begintime and endtime:

+---------------------+---------------------+
| begintime           | endtime             |
+---------------------+---------------------+
| 2024-05-22 10:13:23 | 2024-05-31 13:37:34 |
| 2024-05-30 17:03:21 | 2024-05-31 16:01:25 |
| 2024-05-30 17:03:21 | 2024-05-31 16:01:25 |
| 2024-05-30 17:03:21 | 2024-05-31 16:01:25 |
| 2024-05-31 15:00:00 | 2024-05-31 15:00:03 |
| 2024-05-31 15:01:32 | 2024-05-31 16:01:26 |
+---------------------+---------------------+

The table also contains rows whose begintime is the same as in another row but whose endtime is earlier than in that row. For example:

| 2024-05-22 10:13:23 | 2024-05-31 12:02:18 |

Here begintime is the same as in the first row and endtime is less than in that row.

How can I filter these rows out of the table using MySQL or maybe Python’s pandas?

Answers

- VvdL
- May 31, 2024 at 11:26 am
- 0 votes
The easiest way would be to GROUP BY begintime and take the maximum endtime:
```
SELECT begintime, MAX(endtime) AS endtime
FROM your_table
GROUP BY begintime
```
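To check what this query returns, here is a minimal sketch that runs it against a few of the sample rows from the question plus the example row with the earlier endtime. It uses Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption made only to keep the example self-contained; timestamps are stored as text, which still compares correctly in this format.

```python
import sqlite3

# Sample rows from the question, plus the row that should be filtered out
rows = [
    ("2024-05-22 10:13:23", "2024-05-31 13:37:34"),
    ("2024-05-22 10:13:23", "2024-05-31 12:02:18"),  # earlier endtime
    ("2024-05-30 17:03:21", "2024-05-31 16:01:25"),
    ("2024-05-30 17:03:21", "2024-05-31 16:01:25"),  # exact duplicate
    ("2024-05-31 15:00:00", "2024-05-31 15:00:03"),
]

# In-memory SQLite database standing in for the MySQL table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (begintime TEXT, endtime TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?, ?)", rows)

# Same query as in the answer, with an ORDER BY for stable output
grouped = conn.execute(
    "SELECT begintime, MAX(endtime) AS endtime "
    "FROM your_table GROUP BY begintime ORDER BY begintime"
).fetchall()

for row in grouped:
    print(row)
```

Note that GROUP BY returns one row per begintime, so rows that are exact duplicates of each other are collapsed into a single row as well.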

- OishikSinha
- May 31, 2024 at 11:29 am
- 0 votes
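You can do it with a self-join in MySQL. An inner join on this condition would return the rows that should be excluded, so use a LEFT JOIN and keep only the rows that have no match with a later endtime:
```
SELECT table1.*
FROM your_table table1
LEFT JOIN your_table table2
ON table1.begintime = table2.begintime AND table1.endtime < table2.endtime
WHERE table2.begintime IS NULL;
```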
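A self-join on this condition can be checked on sample data. The sketch below uses the LEFT JOIN ... IS NULL (anti-join) form, which keeps every row that has no other row with the same begintime and a later endtime. It runs in Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption made only for the sake of a runnable example.

```python
import sqlite3

# Sample rows from the question, plus the row that should be filtered out
rows = [
    ("2024-05-22 10:13:23", "2024-05-31 13:37:34"),
    ("2024-05-22 10:13:23", "2024-05-31 12:02:18"),  # earlier endtime
    ("2024-05-30 17:03:21", "2024-05-31 16:01:25"),
    ("2024-05-30 17:03:21", "2024-05-31 16:01:25"),  # exact duplicate
    ("2024-05-31 15:00:00", "2024-05-31 15:00:03"),
]

# In-memory SQLite database standing in for the MySQL table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (begintime TEXT, endtime TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?, ?)", rows)

# Anti-join: keep t1 rows for which no t2 row has the same begintime
# and a strictly later endtime
kept = conn.execute(
    """
    SELECT t1.begintime, t1.endtime
    FROM your_table t1
    LEFT JOIN your_table t2
      ON t1.begintime = t2.begintime AND t1.endtime < t2.endtime
    WHERE t2.begintime IS NULL
    ORDER BY t1.begintime
    """
).fetchall()

for row in kept:
    print(row)
```

Unlike the GROUP BY approach, this form keeps exact duplicate rows, because a row is only dropped when a strictly later endtime exists for the same begintime.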

You can use a self-join in SQL to delete these rows. Note that this removes them from the table permanently:
```
DELETE t1
FROM your_table t1
INNER JOIN your_table t2 ON t1.begintime = t2.begintime
WHERE t1.endtime < t2.endtime;
```

If you want to achieve the same using Python’s pandas:

```
import pandas as pd

# Assuming you have fetched your data into a DataFrame named df
# Convert 'begintime' and 'endtime' columns to datetime dtype if they are not already
df['begintime'] = pd.to_datetime(df['begintime'])
df['endtime'] = pd.to_datetime(df['endtime'])

# Sort the DataFrame by 'begintime' and 'endtime' columns
df = df.sort_values(by=['begintime', 'endtime'])

# For each row, compute the maximum 'endtime' within its 'begintime' group
max_endtime = df.groupby('begintime')['endtime'].transform('max')

# Keep only rows whose 'endtime' equals that maximum, then renumber the index
filtered_df = df[df['endtime'] == max_endtime].reset_index(drop=True)
```
