I am getting duplicate and sometimes unordered results with doing a LEFT OUTER JOIN.
I think the issue is that there are many artwork_colors that can match the WHERE condition. But I just want the artwork’s artwork_color with the highest pixel_percent (that also matches the WHERE clause), and to use that value for ordering the artworks. And of course for each artwork to appear only once in the results.
How do I rewrite the ActiveRecord query to order the results correctly while avoiding duplicates? (This query must be made with ActiveRecord so that I can plug it into an existing pagination system, etc.)
The query, simplified for this question:
# color_ids is an group of ids of colors that are similar to the colors being searched for
# for example if you search for FF0000 it will include similar shades of red
SELECT DISTINCT
"artwork_colors"."pixel_percent",
artworks.*
FROM
"artworks"
LEFT OUTER JOIN "artwork_colors" ON "artwork_colors"."artwork_id" = "artworks"."id"
WHERE
"artwork_colors"."color_id" IN(106, 108, 119, 120, 128, 133, 156, 160)
ORDER BY
"artwork_colors"."pixel_percent" DESC
LIMIT 120 OFFSET 0;
The original query in ActiveRecord:
artworks
.includes(:artwork_colors)
.where('artwork_colors.color_id': color_ids)
.order(pixel_percent: :desc)
.select('artworks.*', 'artwork_colors.pixel_percent')
.distinct
Relevant models and tables:
class Artwork < ApplicationRecord
has_many :artwork_colors, dependent: :destroy
has_many :colors, through: :artwork_colors
end
class ArtworkColor < ApplicationRecord
belongs_to :artwork
belongs_to :color
end
CREATE TABLE public.artwork_colors (
id bigint NOT NULL,
pixel_percent double precision, # this is the desired sort column
artwork_id bigint,
color_id bigint
);
class Color < ApplicationRecord
# These colors were extracted from the Artwork images by Amazon Rekognition, an image analysis tool
has_many :artwork_colors, dependent: :destroy
has_many :artworks, through: :artwork_colors
end
# h s l are hue, saturation, lightness (the color value)
CREATE TABLE public.colors (
id bigint NOT NULL,
h double precision,
s double precision,
l double precision
);
2
Answers
I used a little Ruby to kind of cheat and get the order I needed, and then returned an ActiveRecord collection that just loads an ordered list of the desired artwork ids. This might not work in all use cases, just be warned.
If I understand the goal correctly you are looking for each
Artwork
ordered by the highestartwork_colors.pixel_percent
.In order to accomplish this you will need a sub query to find the maximum pixel_percent for each artwork and then join that subquery to the artworks table.
This can be accomplished as follows:
This should produce the following query:
Since this is an Outer Join it is possible that "highest_pixel_percent" may be NULL. If you would prefer this be represented as 0 then you can change this
subquery[:highest_pixel_percent]
to:which will result in
ISNULL(artwork_colors.highest_pixel_percent,0) AS highest_pixel_percent