Mysql - How to Efficiently Find Consecutive Rows with Matching Criteria in SQL?

walkingbas
November 15, 2024
167 views
1 vote
2 Answers

I have a dataset that tracks user activity, and I need to identify consecutive rows where a specific condition is met. The table structure and sample data are as follows:

Table: UserActivity

UserID	Activity	Timestamp
1	Login	2023-11-01 08:00:00
1	PageView	2023-11-01 08:10:00
1	Login	2023-11-01 08:15:00
1	Logout	2023-11-01 08:20:00
2	Login	2023-11-01 09:00:00
2	Login	2023-11-01 09:05:00
2	PageView	2023-11-01 09:10:00

I want to identify cases where consecutive rows for the same user have the same Activity, along with their timestamps. For example:

For UserID = 2, there are two consecutive Login activities. Like this.

Expected Output

UserID	Activity	StartTimestamp	EndTimestamp
2	Login	2023-11-01 09:00:00	2023-11-01 09:05:00

I have attempted using a self-join to compare each row with the previous one, but it becomes computationally expensive for large datasets. I’ve also tried using LAG and LEAD functions, but I’m struggling to group consecutive rows efficiently.

Answers

- BartMcEndree
- November 15, 2024 at 7:42 pm
- 0 votes
0
Here is a possible solution using LAG. This solution assumes no more then 2 duplicates per UserID+Activity in a row.
```
WITH CTE AS
(
SELECT UserID, Activity, 
        LAG(USERID) OVER (ORDER BY TimeStamp)  as PrevID, 
        LAG(Activity) OVER (ORDER BY TimeStamp)  as PrevActivity, 
        LAG(TimeStamp) OVER (ORDER BY TimeStamp)  as StartTimeStamp, 
        TimeStamp as EndTimeStamp
FROM Example
ORDER BY TimeStamp
)
SELECT UserID, Activity, StartTimeStamp, EndTimeStamp
FROM CTE
WHERE UserID=PrevID AND Activity=PrevActivity
ORDER BY StartTimeStamp
```
fiddle

UserID Activity StartTimeStamp EndTimeStamp

2 Login 2023-11-01 09:00:00 2023-11-01 09:05:00
Login or Signup to reply.

- BartMcEndree
- November 15, 2024 at 8:17 pm
- 0 votes
0
Alternate solution adapting strategy from stackexchange user tombom
https://dba.stackexchange.com/questions/188775/selecting-consecutive-rows-with-same-values

This solution handles 2+ duplicates per UserID+Activity in a row.
```
SELECT userID, 
       Activity, 
       MIN(Timestamp) as starttime,
       MAX(Timestamp) as endtime
FROM (
    SELECT
    t.*
    , @groupNumber := IF(@prev_userID != userID OR @prev_Activity != Activity, @groupNumber + 1, @groupNumber) AS gn
    , @prev_userID := userID
    , @prev_Activity := Activity
    FROM Example t
    , (SELECT @groupNumber := 0, @prev_userID := NULL, @prev_Activity := NULL) var_init_subquery
    ORDER BY userID, Timestamp
) sq
GROUP BY gn, userID, Activity
HAVING Count(*) > 1 
```
fiddle
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Mysql – How to Efficiently Find Consecutive Rows with Matching Criteria in SQL?

Table: UserActivity

Expected Output

Answers