I have 3 tables: products, product_options and product_option_choices. Products can have multiple options, options can have multiple choices.
Choices can adjust the main product price (choice_price_adjust field).
What I’m trying to do is fetch the products and have a field returned on each product that is the maximum possible adjustment value and the minimum possible adjustment value so I know the range of prices this product can be sold at.
This is what I have so far but I seem unable to reference the prod_id field (or even an alias) of the products table within the subquery.
If I give a specific product id in place of pid
it works but obviously I want to subquery running based on each individual product.
I’m also open to suggestions on how to improve the whole query without needing to find a solution to this particular issue.
SELECT *, p.prod_id as pid,
(SELECT SUM(y.min) as min_total
FROM (
SELECT x.min as min
FROM (
SELECT product_option_choices.prod_op_id as op_id, MIN(choice_price_adjust) as min
FROM product_option_choices
WHERE prod_id = pid
GROUP BY product_option_choices.prod_op_id
) x
GROUP BY x.op_id
) y) as choices_min_price
FROM products p
—
EDIT to clarify data and expected results.
The database structure (simplified to only relevant test data and I haven’t added FKs) is as follows:
CREATE TABLE `products` (
`prod_id` int(9) UNSIGNED NOT NULL,
`product_name` varchar(255) DEFAULT NULL,
`price` int(16) UNSIGNED DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO `products` (`prod_id`, `product_name`, `price`) VALUES
(1, 'Product 1', 1000),
(2, 'Product 2', 2000);
CREATE TABLE `product_options` (
`prod_op_id` int(9) UNSIGNED NOT NULL,
`prod_op_name` varchar(255) NOT NULL,
`prod_id` int(9) UNSIGNED NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO `product_options` (`prod_op_id`, `prod_op_name`, `prod_id`) VALUES
(1, 'Color', 1),
(2, 'Size', 1),
(3, 'Grade', 1),
(4, 'Direction', 2),
(5, 'Smell', 2),
(6, 'Origin', 2);
CREATE TABLE `product_option_choices` (
`choice_id` int(16) UNSIGNED NOT NULL,
`prod_op_id` int(9) UNSIGNED NOT NULL,
`prod_id` int(9) UNSIGNED NOT NULL,
`choice_name` varchar(255) NOT NULL,
`choice_price_adjust` int(10) NOT NULL DEFAULT '0'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO `product_option_choices` (`choice_id`, `prod_op_id`, `prod_id`, `choice_name`, `choice_price_adjust`) VALUES
(1, 1, 1, 'Purple', 100),
(2, 1, 1, 'Orange', -500),
(3, 2, 1, 'Small', -300),
(4, 2, 1, 'Large', 400),
(5, 3, 1, 'A', 900),
(6, 3, 1, 'B', 100),
(7, 4, 2, 'Up', 200),
(8, 4, 2, 'Down', -200),
(9, 5, 2, 'Vanilla', -150),
(10, 5, 2, 'Sandlewood', 0),
(11, 6, 2, 'Afica', 0),
(12, 6, 2, 'Norway', 700);
ALTER TABLE `products`
ADD PRIMARY KEY (`prod_id`);
ALTER TABLE `product_options`
ADD PRIMARY KEY (`prod_op_id`);
ALTER TABLE `product_option_choices`
ADD PRIMARY KEY (`choice_id`);
Each product has its own set of options. Each of those options has their own set of choices. In the front end, one choice per option can be selected by users (a selection for each option is required).
What I’m trying to have returned by the query is a row for each product with the product table data (product_name, price) but also a calculated column for both the maximum and minimum possible totals for ‘choice_price_adjust’.
For example, with product 1 the user has to select a value for Color, Size and Grade. They cannot select a value for Direction, Smell or Origin because those options are for product 2.
The user has a choice of: Purple and Orange for Color; Small and Large for Size; A and B for Grade. I only added 2 in each but there can be 1 or more, it is not always 2 choices.
So, for Product 1’s returned row I would want the query to look at the available values for Color (Purple and Orange). It then grabs the lowest value within those choices as the lowest_value (in this case, Orange is lower at -500 so that’s the lowest_value). It also grabs the highest value within those choices (100 in this case from Purple) and that’s the highest_value. Same with Size (in this case, -300 is lowest, 400 is highest). Same for Grade (100 is lowest, 900 is highest). All other choices and options are irrelevant because they do NOT have a prod_id value of 1 (Product 1).
I’d then like the total of the lowest_value numbers (as min_total) and the total of the highest_value numbers (as max_total) to be added as a column in the product row that gets returned along with prod_id, product_name and price for each product. In the above case, min_total would be -700 (-500 + -300 + 100). max_total would be 1,400 (100 + 400 + 900)
My latest attempt based on feedback was to add the subquery to the FROM and consider JOINs but it’s still not right as it is doing the SUM on ALL choices and not the subset I’m trying to pick for each product. I’m using the test data for this now:
SELECT p.prod_id, p.product_name, p.price, SUM(x.min_val) as min_total, SUM(x.max_val) as max_total
FROM products p, (
SELECT
poc.prod_id,
poc.choice_id,
MIN(poc.choice_price_adjust) as min_val,
MAX(poc.choice_price_adjust) as max_val
FROM product_option_choices poc
LEFT JOIN product_options po on(po.prod_op_id = poc.prod_op_id)
LEFT JOIN products p2 on(p2.prod_id = poc.prod_id)
GROUP BY poc.choice_id
) x
GROUP BY p.prod_id;
When I try to add WHERE p2.prod_id = p.prod_id
I’m told p.prod_id is an unknown column. I thought I could use this as it’s only within a single subquery now? Or is this because of the order in which subqueries run based on which part of the main query they are in?
2
Answers
With your sample data provided you should consider to do it step by step using two ctes …
Step 1.
… create a cte (grid) collecting all the products, options and choices data in the same row
Step 2.
… create second cte to get lowest and largest price adjustments per product and option
Step 3.
… main sql with all the data from above and calculated total lowest and largest price adjustments per product
Now you have the resultset for all the products/options/choices containing Min/Max values per product and option as well as total lowest and total largest price adjustments per product. So, if you want just one option – filter it in Main SQL’s WHERE clause:
This resultset can easily be filtered by user provided selection of product and option combination. The combinations could be offered to the users as a kind of list of legit values derived from the collected data (the 1st cte grid) or selecting the combinations directly from your tables. Here are the posible user selections from grid cte:
See the fiddle here (with results for all steps).
ALTERNATIVE ANSWER:
Since there were some missunderstandings about the expected result and since it seams that the result should depend on user provided selection of product id and/or option id here is the answer where the resultset and lowest/largest values depend on user’s selections.
Sample data and this code could be found and played with in the fiddle here.
This solution reflects the way that I see the problem and possible solution, The code could be furtheremore adjusted and/or fine tuned if I got something wrong. What it can’t do is to result with variable number of columns. The number of columns and their datatypes must be exact. If there is a need to get the result in some cases with 6 columns and in the other 4 (or 8) columns then there should be another SQL statement,