skip to Main Content

How do I tell PostgreSQL this? ::

Get me the highest-ranked result for every element in a group, ranked by these attributes that you can already find in an index, so you don’t have to rank the entire table.

See below a practical example of actually working SQL, but with poor performance because it will rank the entire table before giving me the answer.

db<>fiddle: https://dbfiddle.uk/qCp6nt1q

CREATE TABLE job_queue (
  id SERIAL PRIMARY KEY,
  job_type VARCHAR,
  priority INT,
  created_at TIMESTAMP WITHOUT TIME ZONE
  -- Assume there are other columns here as well, specific
  -- to the job object, but I'm ommitting them since they
  -- are irrelevant to the problem at hand.
);

CREATE INDEX job_idx ON job_queue (job_type, priority, created_at);

INSERT
  INTO job_queue (id, job_type, priority, created_at) 
  VALUES 
    (1, 'j1', 1, '2000-01-01 00:00:00'),
    (2, 'j1', 1, '2000-01-02 00:00:00'),
    (3, 'j1', 2, '2000-01-01 00:00:00'),
    (4, 'j2', 1, '2000-01-01 00:00:00'),
    (5, 'j2', 1, '2000-01-02 00:00:00'),
    (6, 'j2', 2, '2000-01-01 00:00:00');


-- Give me the oldest highest-ranked job for each job type.
SELECT id FROM (
  SELECT
    id,
    ROW_NUMBER() OVER (
      PARTITION BY job_type 
      ORDER BY priority, created_at
    ) AS rank
  FROM job_queue
) AS ranked_jobs
WHERE rank = 1;

What is a query that is equivalent to the last example in the SQL above, but that won’t rank the entire table before giving me the result? I think that should be possible because the columns can be found in an index, so, in theory, the database just needs to get the first tuple in the index for each job_type and get that to me, instead of ranking the entire table.

It would be equivalent to:

SELECT id FROM job_queue WHERE job_type = 'j1' ORDER BY priority, created_at LIMIT 1;
SELECT id FROM job_queue WHERE job_type = 'j2' ORDER BY priority, created_at LIMIT 1;

But one single query that would work for as many job types as I would have, and that returns one single table with all the ids that I care about.

3

Answers


  1. What is a query that is equivalent to the last example in the SQL
    above, but that won’t rank the entire table before giving me the
    result?

    This is an other option using multiple CTEs :

    with cte1 as (
      select job_type, min(priority) as min_priority
      from job_queue
      group by job_type
    ),
    cte2 as (
      select j.job_type, min(priority) as min_priority, min(created_at) as min_created_at
      from job_queue j
      inner join cte1 c on c.job_type = j.job_type 
                    and c.min_priority = j.priority
      group by j.job_type
    )
    select j.*
    from job_queue j
    inner join cte2 c on c.job_type = j.job_type 
                      and c.min_created_at = j.created_at
                      and c.min_priority = j.priority;
    

    Demo here

    Login or Signup to reply.
  2. You could use a recursive CTE, essentially mimicking a skip-scan.

    WITH RECURSIVE cte AS (
        SELECT *
        FROM (
            SELECT
              jq.id,
              jq.job_type,
              jq.priority,
              jq.created_at
            FROM job_queue jq
            ORDER BY job_type, priority, created_at
            LIMIT 1
        ) jq
    
        UNION ALL
      
        SELECT
          jq.id,
          jq.job_type,
          jq.priority,
          jq.created_at
        FROM cte
        CROSS JOIN LATERAL (
            SELECT *
            FROM job_queue jq
            WHERE cte.job_type < jq.job_type
            ORDER BY job_type, priority, created_at
            LIMIT 1
        ) jq
    )
    SELECT *
    FROM cte;
    

    db<>fiddle

    Login or Signup to reply.
  3. If you have a table listing each job_type, you could use a lateral join:

    select id from job_type cross join lateral 
        (select id from job_queue where job_queue.job_type=job_type.job_type order by priority, created_at limit 1);
    

    If not, you could use a very ugly recursive CTE to simulate an index skip scan,

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search