I have three tables in a MySQL database, and after joining two of them together, the COUNT()
function in my query no longer works properly, and returns inflated numbers instead.
Please excuse my messy code as it is my first time using MySQL.
This is the query I have written which works:
SELECT
promo.ID,
promo.Branch,
promo.Promoter,
COUNT(CASE WHEN sales.brand = 'TEST' AND sales.datesold BETWEEN '2023-01-01' AND '2023-12-31' AND promo.branch = sales.branch THEN 1 END) AS Actual,
FROM sales.promo
LEFT JOIN sales.sales ON promo.promoter = sales.promoter
GROUP BY promo.ID
ORDER BY promo.ID ASC;
The COUNT()
function here returns the correct number.
I have three tables: promo
, sales
and model
.
What I wanted to do is expand it so that it could also filter the model details. The idea is that using promo.promoter
, you can look up the sales data in sales.sales
by matching the sales with the promoter. Then, using the sales.model
column, you can find information about the relevant model in the model
table. However, when I do
SELECT
promo.ID,
promo.Branch,
promo.Promoter,
COUNT(CASE WHEN sales.brand = 'TEST' AND sales.datesold BETWEEN '2023-01-01' AND '2023-12-31' AND promo.branch = sales.branch THEN 1 END) AS Actual,
COUNT(CASE WHEN sales.brand = 'TEST' AND sales.datesold BETWEEN '2023-01-01' AND '2023-12-31' AND promo.branch = sales.branch AND model.SizeRange = 'Small' THEN 1 END) AS SmallActual
FROM sales.promo
LEFT JOIN sales.sales ON promo.promoter = sales.promoter
LEFT JOIN sales.model ON sales.model = model.model
GROUP BY promo.ID
ORDER BY promo.ID ASC;
it ends up inflating the values in the Actual and SmallActual column by a seemingly random number. How can I fix this?
Searching online, I’ve seen people talk about ‘table aggregation’ and whatnot to fix it, but I don’t understand how that is supposed to work. Does anyone know what’s going wrong here? Thanks!
==========
Edit: Here are my tables with some sample data:
Promoter table:
ID | Branch | Promoter |
---|---|---|
1 | Sample | John Smith |
Sales table:
ID | Promoter | Dealer | Branch | Brand | Model | DateSold |
---|---|---|---|---|---|---|
1 | John Smith | Challenger | Sample | TEST | C123 | 2023-02-25 |
Model Table:
ID | Model | Brand | SizeRange | SalesRangeCategory |
---|---|---|---|---|
1 | C123 | TEST | Small | Entry Level |
Expected output:
ID | Branch | Promoter | Actual | SmallActual |
---|---|---|---|---|
1 | Sample | John Smith | 91 | 26 |
Actual output:
ID | Branch | Promoter | Actual | SmallActual |
---|---|---|---|---|
1 | Sample | John Smith | 106 | 39 |
2
Answers
Firstly, you should use right formatting style. It makes your code readable and more comfortable for work.
What about group by operator. You can’t use columns in select block without using them in grouping and without some aggregation function.
In your task you can make subquery like this.
Do not forget table aliases. And if you want to add Branch, Promoter just use another join. Like this.
There must be something wrong with your data. The additional join should join the sale’s model. This cannot lead to a higher sales count, unless there are two rows in the models table for the same model.
You should fix your data model to ensure data integrity:
The sales table must not contain
promoter
, but thepromoter.id
. And it must not containmodel
, but themodel.id
. E.g.The same should apply for other entities, like dealers, branches and brands.
And make sure your tables contain the appropriate unique constraints:
This would look like this: