I have a table of records. I want to find pairs of records that are exactly the same except for their times, which differ by less than 2 seconds.
I created a query that works, but it uses a join, so each pair comes back as one wide row. I'd like to switch it to a group by so the records are listed vertically, rather than having to read across the entire row to see both records.
The join solution does the job, but I would really like to see what a group by solution could look like, or any solution that shows both records vertically.
This is the query I have that works:
    SELECT p1.*, p2.*
    FROM punches_history p1
    INNER JOIN punches_history p2 ON
        p1.ClockNumber = p2.ClockNumber AND
        p1.PartNumber = p2.PartNumber AND
        p1.Quantity = p2.Quantity AND
        p1.YYMMDD = p2.YYMMDD AND
        ABS(p1.TIMED - p2.TIMED) < 2
    WHERE p1.ClockNumber != '10' AND p1.id = (p2.id + 1) AND p1.CardColor = 'BLU'
    ORDER BY p1.YYMMDD DESC, p1.ClockNumber DESC
The WHERE condition that needs explaining is p1.id = (p2.id + 1).
It is there so that when records 20 and 21 are found as a pair, the mirrored pair 21 and 20 isn't also returned.
2 Answers
Here is a way I think you can pivot this vertically. Take the ids from the two columns and join back on the table using an OR. You might continue from there to group in some way if you wanted to.
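A minimal sketch of that idea, reusing the join from the question as a derived table. The alias names pairs, id1 and id2 are made up for illustration, and it assumes id is unique. DISTINCT is there so that a row belonging to two pairs (such as 21 in 20/21 and 21/22) is listed only once.

```sql
-- Sketch only: lists both rows of each near-duplicate pair vertically.
-- Assumes id is unique; pairs, id1 and id2 are illustrative names.
SELECT DISTINCT ph.*
FROM punches_history ph
INNER JOIN (
    -- Same pair detection as in the question, keeping only the two ids
    SELECT p1.id AS id1, p2.id AS id2
    FROM punches_history p1
    INNER JOIN punches_history p2 ON
        p1.ClockNumber = p2.ClockNumber AND
        p1.PartNumber = p2.PartNumber AND
        p1.Quantity = p2.Quantity AND
        p1.YYMMDD = p2.YYMMDD AND
        ABS(p1.TIMED - p2.TIMED) < 2
    WHERE p1.ClockNumber != '10' AND p1.id = (p2.id + 1) AND p1.CardColor = 'BLU'
) pairs ON ph.id = pairs.id1 OR ph.id = pairs.id2
-- Ordering by id last keeps the two rows of a pair next to each other
ORDER BY ph.YYMMDD DESC, ph.ClockNumber DESC, ph.id;
```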
As I read your question, you want to identify consecutive rows that share the same values in a set of columns and are less than 2 seconds apart, which suggests a gaps-and-islands approach.
The idea is to compare the values of the related columns across rows using lag() to identify islands. Here is a query that works for "duplicates" in columns ClockNumber and PartNumber:
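A sketch of what such a gaps-and-islands query could look like. It assumes an engine with window functions (for example MySQL 8+), that id follows insertion order, and that TIMED is a number of seconds. The names is_new_island, grp, start_id, end_id, start_timed, end_timed and cnt are illustrative, and the filters on ClockNumber and CardColor from the question are left out.

```sql
-- Sketch only: assumes window functions, id in insertion order,
-- and TIMED stored as a number of seconds.
SELECT
    ClockNumber,
    PartNumber,
    MIN(id)    AS start_id,
    MAX(id)    AS end_id,
    MIN(TIMED) AS start_timed,
    MAX(TIMED) AS end_timed,
    COUNT(*)   AS cnt
FROM (
    SELECT
        flagged.*,
        -- Running total of island starts = island number
        SUM(is_new_island) OVER (ORDER BY id) AS grp
    FROM (
        SELECT
            ph.*,
            CASE
                WHEN ClockNumber = LAG(ClockNumber) OVER (ORDER BY id)
                 AND PartNumber  = LAG(PartNumber)  OVER (ORDER BY id)
                 AND TIMED - LAG(TIMED) OVER (ORDER BY id) < 2
                THEN 0   -- continues the previous row's island
                ELSE 1   -- first row, or a value changed: new island
            END AS is_new_island
        FROM punches_history ph
    ) flagged
) islands
GROUP BY grp, ClockNumber, PartNumber
HAVING COUNT(*) > 1      -- keep only islands with at least two rows
ORDER BY start_id;
```

Each row flagged with 1 opens a new group, so the running sum gives every consecutive run its own grp value, and the outer GROUP BY collapses each run into a single row.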
This brings one row for each group of (consecutive) duplicate rows, along with the starting and ending id and timed, and the count of duplicates per group.
You would typically extend the query with more lag()s and more conditions in the case expression to add more columns to the duplicate detection logic (Quantity and YYMMDD).
Note: presumably Timed is always increasing, so abs() is not needed.
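As an illustration of that extension, here is a sketch of just the flagging step with all four columns compared. It assumes window functions and named windows are available (for example MySQL 8+), and is_new_island and w are illustrative names. A row gets 0 when it matches the previous row (by id) on every column and follows it by less than 2 seconds, and 1 otherwise.

```sql
-- Sketch only: flagging step with Quantity and YYMMDD added.
-- Assumes id follows insertion order and TIMED never decreases within a day.
SELECT
    ph.*,
    CASE
        WHEN ClockNumber = LAG(ClockNumber) OVER w
         AND PartNumber  = LAG(PartNumber)  OVER w
         AND Quantity    = LAG(Quantity)    OVER w
         AND YYMMDD      = LAG(YYMMDD)      OVER w
         AND TIMED - LAG(TIMED) OVER w < 2
        THEN 0   -- same values as the previous row, under 2 seconds later
        ELSE 1   -- first row, a NULL, or any difference: starts a new group
    END AS is_new_island
FROM punches_history ph
WINDOW w AS (ORDER BY id);
```

A running SUM() of this flag ordered by id then numbers the groups, which can be aggregated with GROUP BY to get one row per group.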