How to order by timestamp and group based on score
- Entities starting with the same prefix (say `a1`) are similar, but we need to pick the one with the highest score, in this case `a1-1`.
- For `a2`, it should pick `a2-1`, since it has a score of 0.6.
- For `a3`, it should be `a3-0`.
- For `a4`, it should be `a4-1`, since it's more recent.
- Assume the times are listed in ascending order of creation time (i.e. t1, t2, …).
The `entities` table contains the following:
1. a1-0, x0, 0.8, t1
2. a2-0, x1, 0.5, t2
3. a2-1, x2, 0.6, t3
4. a3-0, x3, 0.8, t4
5. a2-2, x4, 0.7, t5
6. a1-1, x5, 0.9, t6
7. a4-0, x6, 0.6, t7
8. a4-1, x7, 0.6, t8
9. a1-2, x8, 0.85, t9
This is the output I expect. What SQL query should I use?
- a4-1
- a3-0
- a2-1
- a1-1
2 Answers
Use PostgreSQL's brilliant `distinct on` with a custom order. Please note that `a2-2` has a higher score than `a2-1`.
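The answer's original query is not preserved here, so the following is only a minimal sketch of what `distinct on` with a custom order could look like. It assumes the columns are named `code`, `payload`, `score` and `creation_time` (`payload` is a hypothetical name for the `x0`…`x8` column), uses the integers 1 to 9 as stand-ins for t1…t9, and extracts the prefix with `split_part`.

```sql
-- Sample rows from the question. Column names and the integer
-- creation_time values (1..9 standing in for t1..t9) are assumptions.
WITH entities (code, payload, score, creation_time) AS (
    VALUES
        ('a1-0', 'x0', 0.8,  1),
        ('a2-0', 'x1', 0.5,  2),
        ('a2-1', 'x2', 0.6,  3),
        ('a3-0', 'x3', 0.8,  4),
        ('a2-2', 'x4', 0.7,  5),
        ('a1-1', 'x5', 0.9,  6),
        ('a4-0', 'x6', 0.6,  7),
        ('a4-1', 'x7', 0.6,  8),
        ('a1-2', 'x8', 0.85, 9)
)
SELECT DISTINCT ON (split_part(code, '-', 1))
       code, score, creation_time
FROM entities
ORDER BY split_part(code, '-', 1) DESC,  -- must lead with the DISTINCT ON expression
         score DESC,                     -- highest score wins within a prefix
         creation_time DESC;             -- ties go to the most recent row
```

With the question's data this yields `a4-1`, `a3-0`, `a2-2`, `a1-1`: the `a2` group resolves to `a2-2` (score 0.7) rather than the `a2-1` listed in the expected output.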
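From the doc: `DISTINCT ON (expression [, ...])` keeps only the first row of each set of rows where the given expressions evaluate to equal.
You can combine that with string functions to establish sets based on the prefix: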
- `(string_to_array(code,'-'))[1]` cuts up the code at each `-`, producing an array; then `[1]` takes its first element, which will be the prefix.
- `ORDER BY` dictates that the prefix is primarily responsible for the position. You wanted the highest prefix first, so you need descending (`DESC`) order.
- `DISTINCT ON` will discard all but one record for each prefix.
- To keep the record with the highest score, the second `ORDER BY` expression is `score DESC`.
- To pick `a4-1` over `a4-0` based on the more recent creation time, you need `creation_time DESC`.
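The steps above can be put together roughly as follows. This is a sketch rather than the answer's original demo: the table definition is assumed, `payload` is a hypothetical column name, and `creation_time` uses 1 to 9 in place of t1…t9.

```sql
-- Assumed schema for the question's table; only code, score and
-- creation_time are named in the answer, payload is hypothetical.
CREATE TEMP TABLE entities (
    code          text,
    payload       text,
    score         numeric,
    creation_time integer   -- 1..9 stand in for t1..t9
);

INSERT INTO entities (code, payload, score, creation_time) VALUES
    ('a1-0', 'x0', 0.8,  1),
    ('a2-0', 'x1', 0.5,  2),
    ('a2-1', 'x2', 0.6,  3),
    ('a3-0', 'x3', 0.8,  4),
    ('a2-2', 'x4', 0.7,  5),
    ('a1-1', 'x5', 0.9,  6),
    ('a4-0', 'x6', 0.6,  7),
    ('a4-1', 'x7', 0.6,  8),
    ('a1-2', 'x8', 0.85, 9);

-- One row per prefix: prefix descending, then best score, then newest.
SELECT DISTINCT ON ((string_to_array(code, '-'))[1]) *
FROM entities
ORDER BY (string_to_array(code, '-'))[1] DESC,
         score DESC,
         creation_time DESC;
```

The leftmost `ORDER BY` expression has to match the `DISTINCT ON` expression; the remaining expressions only decide which row of each prefix group comes first and is therefore kept.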
It's good to normalise your structure. You're holding multiple pieces of information as a single `code`, which forces you to use string functions in order to separate them and process them individually. Taking `a2-0` as an example, put the primary group identifier `a` into one column, the secondary `2` into another, and the order number `0` into another. You might prefer to just split into prefix and suffix, but the extra granularity might be worth it if you consider that text sorts `a800` ahead of `a9`, as a lower value.
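A normalised layout along those lines might look like the sketch below. The table name, column names and sample rows are all hypothetical, chosen only to show the `a800` versus `a9` case.

```sql
-- Text comparison goes character by character, so 'a800' sorts before 'a9'.
SELECT ('a800' COLLATE "C") < 'a9' AS a800_sorts_first;  -- true

-- Hypothetical normalised table: one column per piece of the old code.
CREATE TEMP TABLE entities_normalised (
    group_letter  text,     -- 'a' in 'a2-0'
    group_number  integer,  -- 2   in 'a2-0'
    order_number  integer,  -- 0   in 'a2-0'
    score         numeric,
    creation_time integer
);

INSERT INTO entities_normalised VALUES
    ('a', 9,   0, 0.5, 1),
    ('a', 9,   1, 0.7, 2),
    ('a', 800, 0, 0.6, 3);

-- No string functions needed, and group 800 now ranks above group 9.
SELECT DISTINCT ON (group_letter, group_number)
       concat(group_letter, group_number, '-', order_number) AS code,
       score
FROM entities_normalised
ORDER BY group_letter DESC,
         group_number DESC,
         score DESC,
         creation_time DESC;
```

Because `group_number` is an integer, the descending order puts `a800-0` before `a9-1`, which a plain text sort on the combined prefix would not do.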