I have a very flat table that is a collection of swimmer names, distances, events (strokes), and times(float). What I would like is to get the only fastest time for each combination of distance and stroke. To be clear, there are multiple distances and multiple strokes (50 free, 100 free, 200 free, 50 back, 100 back .. and so on)
The data looks like the following where the "bold" names would be the records I want returned.
NAME | DISTANCE | STROKE_ID | TIME |
---|---|---|---|
Susane Smith | 50 | 1 | 74.33 |
Cassey Putnam | 50 | 1 | 107.24 |
Kelly Maine | 50 | 1 | 118.08 |
Susane Smith | 100 | 1 | 168.4 |
Charlotte Castle | 200 | 1 | 476.46 |
Susane Smith | 50 | 2 | 83.23 |
Charlotte Castle | 50 | 2 | 104.83 |
Katy Cooper | 50 | 2 | 167.49 |
Susane Smith | 100 | 2 | 174.93 |
Charlotte Castle | 100 | 2 | 219.29 |
JoAnne Cooper | 100 | 2 | 355.74 |
Maddy Franks | 200 | 2 | 377.15 |
JoAnne Cooper | 200 | 2 | 733.77 |
Charlotte Castle | 50 | 3 | 138.5 |
Charlotte Castle | 100 | 3 | 349.2 |
Charlotte Castle | 200 | 3 | 676.12 |
2
Answers
I created a table exactly like yours, entered the data and fooled around and got to the following query:
Lets break this down step by step:
The query exists of two parts, the outer query and the inner query. First we’ll look at the inner query.
This query adds an extra column to the data called ‘
rn
‘. This column is based on theDISTANCE
paired with theSTROKE_ID
. Each different combination of the two gets a new number. So the combinations might look like this:etc. It then sorts these results ascending by time. Mind that the data I showed above is not the actual newly created table, it just goes to show what the ‘
rn
‘ column does. Then this new ‘table’ gets the aliasSUB
(short for subquery).Then the outer query:
The outer query selects the NAME, DISTANCE, STROKE_ID and TIME from
SUB
where rn = 1. We choose 1 for rn because this is the row with the fastest time.Note that the word
SUB
is used once in the total query, and twice in my explanation. This is only so it reads more easily, the uppermost(total) query is what you would actually type in your terminal.Hope this is what you meant.
[EDIT]
This was my output by the way:
DISTINCT ON
makes this very simple:Detailed explanation:
Subtle detail: you seem to want to order by
stroke_id
first, hence the sort order inORDER BY
.