I am trying to get a table of an average for a column, across multiple rows (if that makes sense).
For example, in the table below, I need to group the postcodes (as there can be more than 1 suburb to a postcode), but also group the property and rooms, to return an average across those particular rows.
Sample table:
postcode | suburb | property | rooms | price |
---|---|---|---|---|
0001 | A | apartment | 1 | 100,000 |
0001 | A | apartment | 2 | 200,000 |
0001 | A | house | 1 | 100,000 |
0001 | A | house | 2 | 200,000 |
0001 | A | house | 3 | 300,000 |
0001 | B | house | 1 | 50,000 |
0001 | B | apartment | 2 | 150,000 |
Desired outcome:
property | rooms | avg_price |
---|---|---|
apartment | 1 | 100,000 |
apartment | 2 | 175,000 |
house | 1 | 75,000 |
house | 2 | 200,000 |
house | 3 | 300,000 |
So I have averaged apartments with 2 rooms across the 2 suburbs, but within the 1 postcode. Likewise for 1 bedroom houses.
My attempt is below. Without the group by
at the end, it returns duplicate rows, with the group by
it returns 1 row.
"SELECT `property`, `rooms`, ROUND(AVG(`price`)
OVER (PARTITION BY `property`,`rooms`),0)
AS avg_price FROM `suburbs` WHERE `postcode` = '0001' GROUP BY `postcode`
I suspect I need to nest the postcode
first, then do the partition over
, but not working. Can someone please advise. Thanks in advance.
4
Answers
Can you try this :
You can remove the duplicates using "Distinct" Key Word
Your request is not clear. How does the postcode come into play? Your sample table shows data for one postcode only, so what would the result look like for two or more postcodes?
Anyway, it seems you have not really understood what
GROUP BY
does.GROUP BY x, y, z
means "I want one result row per x, y, z combination". You group by postcode, so you get one result row per postcode. As you also limit your data to postcode ‘0001’, you get one result row of course.You also confuse aggregation functions where you condense your data to less rows and show the average (
AVG
), sum (SUM
), whatever per group, with analytic functions, where you keep the original rows and add an aggregation gained from the result data set, for which you useAVG OVER
,SUM OVER
etc.You want aggregation. You don’t want to show the original rows from the table, but group rows per property and rooms.
Now to the postcode. What do you want to do? Do you want to select data for postcode ‘0001’ only and show one result row per property and rooms? Then
Or do you want to show this for all postcodes separately? Then you must select the postcode along and put it in your
GROUP BY
clause, so as to get one result row per postcode, property and rooms.I should add that your query is invalid as can be seen here: https://dbfiddle.uk/h36YNfh3. You group by postcode, but you select property and rooms. There can be more than one property and rooms for a postcode, so which would you want to select? You say that you can run your query and get one result row. This indicates that you are running MySQL in a cheat mode that allows you to write invalid
GROUP BY
queries. It is all the more surprising that your version suppports window function (aka analytic functions). If I remember correctly, window functions got introduced in MySQL 8, and the cheat mode that was the default in earlier versions was no longer the default then. Have you deliberately removed the defaultONLY_FULL_GROUP_BY
from the modes? Never do this. Always haveONLY_FULL_GROUP_BY
switched on when working with MySQL, no matter which version. Otherwise the DBMS will let you write invalid queries as in your example and not warn you about it.