skip to Main Content

I have a table of products and another table of specific listings of these products.

CREATE TABLE products (id INT AUTOINCREMENT);
CREATE TABLE listings (
  id INT AUTOINCREMENT,
  product INT REFERENCES products(id),
  vendor INT
)

I want to select listings for a set of products in such way that they are sold by minimal count of different vendors. As an example:

id product vendor
1 1 1
2 1 2
3 2 3
4 2 1
5 3 4

For products (1,2,3) I expect to get ids (1,4,5). Is there a way to achieve this just using SQL? Or should I simply use multiple queries and combine the results elsewhere?

2

Answers


  1. You can achieve this using SQL by writing a query that selects the listings for the specified products in a way that minimizes the count of different vendors. Here’s a SQL query that should give you the desired result:

    WITH RankedListings AS (
      SELECT
        l.id AS listing_id,
        l.product,
        l.vendor,
        ROW_NUMBER() OVER (PARTITION BY l.product ORDER BY COUNT(DISTINCT v2.vendor)) AS rn
      FROM listings l
      JOIN listings v2 ON l.product = v2.product
      WHERE l.product IN (1, 2, 3)
      GROUP BY l.id, l.product, l.vendor
    )
    SELECT listing_id
    FROM RankedListings
    WHERE rn = 1;
    
    

    In query, the common table expression (CTE) named RankedListings is utilized to rank listings by counting different vendors for the same product and assign a rank using the ROW_NUMBER() function. This CTE involves a self-join on the listings table to determine the number of unique vendors selling the same product. The results are filtered to include only specified products (1, 2, 3). In the main query, it selects listing IDs from the RankedListings CTE with a row number of 1, effectively identifying listings sold by the fewest different vendors for each product, achieving the desired outcome

    Login or Signup to reply.
  2. Use a recursive query to get all possible combinations. In your case IDs (1,3,5), (1,4,5), (2,3,5) and (2,4,5). Then check the distinct number of vendors per set, take the one with the lowest number of vendors and show the rows.

    Here are my steps:

    1. I number the products, so if I have products 10, 20 and 30, they get the indexes 1, 2 and 3, and so I can go from one product to the next by adding one to the index.
    2. I use a recursive CTE, to get all ID combinations. As this is recursive, I’ll get the complte series like (1,3,5), but also the incomplete ones (1), (1,3), etc. I remember the IDs and the vendors in arrays.
    3. I pick the rows with the largest arrays, i.e. the complete ones (1,3,5), (1,4,5), (2,3,5) and (2,4,5).
    4. I pick the one with the lowest number of distinct values. (In case of a tie, I’ll pick one ID set arbitrarily.)
    5. I select the rows for the IDs of my best set.

    The query:

    with recursive
      prep as
      (
        select l.*, dense_rank() over (order by product) as pidx
        from listings l
      ),
      series(ids, vendors, pidx) as
      (
        select
          array[id] as ids,
          array[vendor] as vendors,
          1
        from prep
        where pidx = 1
        union all
        select
          array_append(s.ids, p.id),
          array_append(s.vendors, p.vendor),
          s.pidx + 1
        from series s
        join prep p on p.pidx = s.pidx + 1
      ),
      complete as
      (
        select ids, vendors
        from series
        order by rank() over (order by cardinality(ids) desc)
        fetch first row with ties
      ),
      best as
      (
        select ids, vendors
        from complete
        order by (select cardinality(array_agg(distinct vendor))
                  from unnest(vendors) t(vendor))
        fetch first row only
      )
    select *
    from listings
    where id in (select unnest(ids) from best)
    order by id;
    

    Demo: https://dbfiddle.uk/uMqMs0xn

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search