I want to first select 8 rows from a MySQL table, ordered by id DESC.
Then I want to select a second group of 8 rows, this time ordered by date DESC, while making sure that none of them already appear in the first query (the one ordered by id).
The data is all in the same table, with columns such as id, name and date.
So far I have tried writing different queries, but the two result sets share some items, which is exactly what I don't want.
Here are the queries I have written:
This returns 8 items sorted by id DESC:
SELECT name FROM person ORDER BY id DESC LIMIT 8;
This also returns 8 items, but sorted by date DESC:
SELECT name FROM person ORDER BY date DESC LIMIT 8;
The returned data contains duplicate items!
3 Answers
The first query should return the primary key for the table. If name is the key then so be it, but the id field is probably the better choice. Then we can write the query like this:
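As a minimal sketch of what such a query could look like (assuming id is the primary key of person; the alias top_ids is made up here, and NOT EXISTS is just one way to express the exclusion), the 8 highest ids go into a derived table and the outer query skips any row found there:

```sql
-- Sketch only: top_ids is a hypothetical alias, id is assumed to be the primary key.
SELECT p.name
FROM person AS p
WHERE NOT EXISTS (
    -- The derived table holds the ids returned by the first query.
    SELECT 1
    FROM (SELECT id FROM person ORDER BY id DESC LIMIT 8) AS top_ids
    WHERE top_ids.id = p.id
)
ORDER BY p.date DESC
LIMIT 8;
```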
We could also use an exclusion join. It is usually slower, but in this case it removes one level of nesting, so it might do better:
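An exclusion join along these lines might look like the following sketch (again assuming id is the primary key; top_ids is a hypothetical alias). The LEFT JOIN matches each row against the 8 highest ids, and the IS NULL test keeps only the rows that found no match:

```sql
-- Sketch only: rows with a match in top_ids belong to the first query and are dropped.
SELECT p.name
FROM person AS p
LEFT JOIN (SELECT id FROM person ORDER BY id DESC LIMIT 8) AS top_ids
       ON top_ids.id = p.id
WHERE top_ids.id IS NULL
ORDER BY p.date DESC
LIMIT 8;
```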
One other thing to keep in mind is that MySQL is strict about which kinds of subquery can use the LIMIT keyword. Specifically, it needs to be a derived table. I know the exclusion join option should qualify, but I'm less sure about the NOT EXISTS() option.

You could use a nested query: first select the top 8 ids, then select the first 8 records ordered by date, excluding those ids:
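One way to sketch this nested query, assuming id uniquely identifies a row, is shown below. MySQL rejects LIMIT placed directly inside an IN / NOT IN subquery, so the inner SELECT is wrapped in a derived table (the alias top_ids is made up for illustration):

```sql
-- Sketch only: the extra SELECT ... FROM (...) layer exists solely so that
-- LIMIT lives in a derived table rather than directly in the NOT IN subquery.
SELECT name
FROM person
WHERE id NOT IN (
    SELECT id
    FROM (SELECT id FROM person ORDER BY id DESC LIMIT 8) AS top_ids
)
ORDER BY date DESC
LIMIT 8;
```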
Why not generate both result sets with a single query? We can combine window functions, ORDER BY, and LIMIT to generate a result set containing the top 8 rows by id followed by the top 8 remaining rows by date, while avoiding duplicates:
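A possible shape for such a query is sketched below. It assumes MySQL 8.0 or later (window functions are not available before that) and the person table from the question; the aliases ranked, rn_id and rn_date are made up for illustration:

```sql
-- Sketch only: requires window function support (MySQL 8.0+).
SELECT id, name, date
FROM (
    SELECT p.id, p.name, p.date,
           ROW_NUMBER() OVER (ORDER BY p.id DESC)   AS rn_id,   -- 1 = highest id
           ROW_NUMBER() OVER (ORDER BY p.date DESC) AS rn_date  -- 1 = latest date
    FROM person AS p
) AS ranked
ORDER BY
    CASE WHEN rn_id <= 8 THEN 0 ELSE 1 END,  -- the 8 highest ids come first
    CASE WHEN rn_id <= 8 THEN rn_id END,     -- ...sorted by descending id
    rn_date                                  -- everything else by descending date
LIMIT 16;
```

If several rows share the same date, ROW_NUMBER() breaks the tie arbitrarily, so adding id as a second sort key inside the OVER clause may be worth considering.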
In the subquery, the window functions enumerate records by descending id and by descending date. The outer query performs a conditional sort that puts the top 8 ids first, and orders the rest of the records by descending date. All that is left to do is retain the top 16 results from the query. You don't need to worry about duplicates, since the table is scanned only once and each row appears at most once in the result.
Here is a small test case:
For this sample data, and given a target of 3 + 3 records (instead of 8 + 8 in our code), the query returns:
Notably, id 7, which has both the greatest id and the second latest date, shows up in the first part of the result set (the top 3 rows are sorted by descending id), but is not repeated in the second part.
Demo on DB fiddle
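To try the single-query approach on a small scale, here is a hypothetical test case with a 3 + 3 target. The names and dates are invented for illustration; they are only chosen so that id 7 has the greatest id and the second latest date, as in the description above. It assumes MySQL 8.0 or later.

```sql
-- Hypothetical sample data: 7 rows, all dates distinct.
CREATE TABLE person (
    id   INT PRIMARY KEY,
    name VARCHAR(50),
    date DATE
);

INSERT INTO person (id, name, date) VALUES
    (1, 'a', '2020-01-01'),
    (2, 'b', '2020-01-07'),  -- latest date
    (3, 'c', '2020-01-03'),
    (4, 'd', '2020-01-05'),
    (5, 'e', '2020-01-02'),
    (6, 'f', '2020-01-04'),
    (7, 'g', '2020-01-06');  -- greatest id, second latest date

-- Top 3 by id, then the top 3 remaining rows by date.
SELECT id, name, date
FROM (
    SELECT p.id, p.name, p.date,
           ROW_NUMBER() OVER (ORDER BY p.id DESC)   AS rn_id,
           ROW_NUMBER() OVER (ORDER BY p.date DESC) AS rn_date
    FROM person AS p
) AS ranked
ORDER BY
    CASE WHEN rn_id <= 3 THEN 0 ELSE 1 END,
    CASE WHEN rn_id <= 3 THEN rn_id END,
    rn_date
LIMIT 6;

-- Rows come back in this id order: 7, 6, 5 (by descending id),
-- then 2, 4, 3 (by descending date); id 7 is not repeated.
```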