How do I tell PostgreSQL this? ::
Get me the highest-ranked result for every element in a group, ranked by these attributes that you can already find in an index, so you don’t have to rank the entire table.
See below a practical example of actually working SQL, but with poor performance because it will rank the entire table before giving me the answer.
db<>fiddle: https://dbfiddle.uk/qCp6nt1q
CREATE TABLE job_queue (
id SERIAL PRIMARY KEY,
job_type VARCHAR,
priority INT,
created_at TIMESTAMP WITHOUT TIME ZONE
-- Assume there are other columns here as well, specific
-- to the job object, but I'm ommitting them since they
-- are irrelevant to the problem at hand.
);
CREATE INDEX job_idx ON job_queue (job_type, priority, created_at);
INSERT
INTO job_queue (id, job_type, priority, created_at)
VALUES
(1, 'j1', 1, '2000-01-01 00:00:00'),
(2, 'j1', 1, '2000-01-02 00:00:00'),
(3, 'j1', 2, '2000-01-01 00:00:00'),
(4, 'j2', 1, '2000-01-01 00:00:00'),
(5, 'j2', 1, '2000-01-02 00:00:00'),
(6, 'j2', 2, '2000-01-01 00:00:00');
-- Give me the oldest highest-ranked job for each job type.
SELECT id FROM (
SELECT
id,
ROW_NUMBER() OVER (
PARTITION BY job_type
ORDER BY priority, created_at
) AS rank
FROM job_queue
) AS ranked_jobs
WHERE rank = 1;
What is a query that is equivalent to the last example in the SQL above, but that won’t rank the entire table before giving me the result? I think that should be possible because the columns can be found in an index, so, in theory, the database just needs to get the first tuple in the index for each job_type
and get that to me, instead of ranking the entire table.
It would be equivalent to:
SELECT id FROM job_queue WHERE job_type = 'j1' ORDER BY priority, created_at LIMIT 1;
SELECT id FROM job_queue WHERE job_type = 'j2' ORDER BY priority, created_at LIMIT 1;
But one single query that would work for as many job types as I would have, and that returns one single table with all the ids that I care about.
3
Answers
This is an other option using multiple CTEs :
Demo here
You could use a recursive CTE, essentially mimicking a skip-scan.
db<>fiddle
If you have a table listing each job_type, you could use a lateral join:
If not, you could use a very ugly recursive CTE to simulate an index skip scan,