skip to Main Content

I have a table in Postgres

id title …..
1 title1 …..
2 title2 …..
3 title1 …..
4 title13 …..
5 title4 …..
6 title7 …..
7 title1 …..
8 title1 …..
9 title53 …..
10 title11 …..
11 title1 …..

and I have a Golang app that uses a GORM as ORM

I need to select columns id and title, where title should be distinct, and it should have key-set pagination by id. For example, if page has side 3, I should select

1 – title1
2 – title2
4 – title13

2d page should be

5 – title4
6 – title7
9 – title53

Any thoughts are welcome, even row sql

2

Answers


  1. You can first select all unique titles as a set, then get their id from your table. For example:

    WITH titles AS (
        SELECT DISTINCT title, id FROM my_table
    )
    
    SELECT my_table.id, my_table.title FROM my_table
    JOIN titles ON titles.id = my_table.id
    ORDER BY id -- or by other column
    LIMIT 3
    OFFSET 3
    

    EDIT: if you have large amount of data in this table you can further optimize the query to limit and skip unique titles first then get their id:

    WITH titles AS (
        SELECT DISTINCT title, id FROM my_table
        ORDER BY id
        LIMIT 3
        OFFSET 3
    )
    
    SELECT my_table.id, my_table.title, .... FROM my_table
    JOIN titles ON titles.id = my_table.id
    
    Login or Signup to reply.
  2. Assuming id is UNIQUE NOT NULL, this achieves your objective:

    WITH cte AS (
       SELECT DISTINCT ON (title)
              title, id
       FROM   tbl
       ORDER  BY title, id
       )
    SELECT *
    FROM   cte
    ORDER  BY id
    LIMIT  3
    OFFSET 3;
    

    fiddle

    Notably, DISTINCT ON cannot be combined with an ORDER BY that disagrees on the same query level. See:

    But the intermediate step with DISTINCT ON disables direct index support. An index with leading title field may still be used, but the whole table (or just index in an index-only scan) has to be processed for every query. Not actually "key-set pagination".

    If at all possible, create a MATERIALIZED VIEW with an index on id. That is, you can afford to reuse the generated result for enough time before having to refresh.

    CREATE MATERIALIZED VIEW mv_tbl AS
    SELECT DISTINCT ON (title)
           title, id
    FROM   tbl
    ORDER  BY title, id;
    
    CREATE UNIQUE INDEX mv_tbl_uniq ON mv_tbl (id) INCLUDE (title);
    

    The optional INCLUDE clause enables index-only scans (after running VACUUM, or VACUUM ANALYZE on it, unless autovacuum has kicked in.)

    Depending on data distribution the DISTINCT ON query might be optimized further. See:

    The query now is just:

    SELECT *
    FROM   mv_tbl
    ORDER  BY id
    LIMIT  3
    OFFSET 3;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search