I want to ‘group by’ beers so that they are grouped together, with the respect aggregate rating and tasters (people who review the beers) listed in separate columns.
Here is my code:
create or replace view tasters_avg_ratings1
as
select a.taster as taster, a.beer as beer, round(avg(a.rating),1) as rating
from allratings a
group by beer, taster
;
Yet my output looks like this:
beers=# select * from tasters_avg_ratings1;
taster | beer | rating
--------+------------------------+--------
Peter | XXXX | 5.0
Sarah | James Squire Pilsener | 3.0
Raghu | Sierra Nevada Pale Ale | 3.0
Hector | Fosters | 3.0
John | Chimay Red | 3.0
John | Sierra Nevada Pale Ale | 5.0
Geoff | James Squire Pilsener | 4.0
Ramez | Sierra Nevada Pale Ale | 4.0
John | 80/- | 4.0
John | Rasputin | 4.0
Adam | Old | 4.0
John | Crown Lager | 2.0
Jeff | Sierra Nevada Pale Ale | 4.0
Sarah | Burragorang Bock | 4.0
Sarah | Scharer's Lager | 3.0
Sarah | New | 2.0
Geoff | Redback | 4.0
Adam | Victoria Bitter | 1.0
Sarah | Victoria Bitter | 1.0
Raghu | Rasputin | 3.0
Ramez | Bigfoot Barley Wine | 3.0
Hector | Sierra Nevada Pale Ale | 4.0
Sarah | Old | 3.0
Jeff | Burragorang Bock | 3.0
John | Empire | 3.0
Sarah | James Squire Amber Ale | 3.0
Rose | Redback | 5.0
Geoff | Empire | 3.0
Adam | New | 1.0
Jeff | Rasputin | 1.0
Raghu | Old Tire | 5.0
John | Victoria Bitter | 1.0
(32 rows)
As you can see, the beers are NOT grouped together. Ideally for example, the ‘Victoria Bitter’ beers should be displayed as a group, not separated.
The desired result is achieved using ‘order by’. For example:
create or replace view tasters_avg_ratings1
as
select a.taster as taster, a.beer as beer, round(avg(a.rating),1) as rating
from allratings a
group by beer, taster
order by a.beer
;
OUTPUT:
beers=# select * from tasters_avg_ratings1;
taster | beer | rating
--------+------------------------+--------
John | 80/- | 4.0
Ramez | Bigfoot Barley Wine | 3.0
Jeff | Burragorang Bock | 3.0
Sarah | Burragorang Bock | 4.0
John | Chimay Red | 3.0
John | Crown Lager | 2.0
Geoff | Empire | 3.0
John | Empire | 3.0
Hector | Fosters | 3.0
Sarah | James Squire Amber Ale | 3.0
Geoff | James Squire Pilsener | 4.0
Sarah | James Squire Pilsener | 3.0
Adam | New | 1.0
Sarah | New | 2.0
Adam | Old | 4.0
Sarah | Old | 3.0
Raghu | Old Tire | 5.0
Jeff | Rasputin | 1.0
John | Rasputin | 4.0
Raghu | Rasputin | 3.0
Geoff | Redback | 4.0
Rose | Redback | 5.0
Sarah | Scharer's Lager | 3.0
Hector | Sierra Nevada Pale Ale | 4.0
Jeff | Sierra Nevada Pale Ale | 4.0
John | Sierra Nevada Pale Ale | 5.0
Raghu | Sierra Nevada Pale Ale | 3.0
Ramez | Sierra Nevada Pale Ale | 4.0
Adam | Victoria Bitter | 1.0
John | Victoria Bitter | 1.0
Sarah | Victoria Bitter | 1.0
Peter | XXXX | 5.0
(32 rows)
So while I know order by achieves my result, why doesn’t ‘group by’ do the same thing? It is frustrating because I have seen numerous examples on the internet using ‘group by’ and succeeding especially in cases similar to mine with non-aggregate and aggregate columns. For example: https://learnsql.com/blog/error-with-group-by/ , at tip #3.
Any help would be appreciated, thanks!
3
Answers
GROUP BY is only being used to compute the aggregate value (average rating in this case). It doesn’t have anything to do with the ordering of the results when they are displayed. As you have mentioned, you need to use ORDER BY to get the desired ordering.
Group By should be used to group rows that have the same value for the specified column,
Full explanation here https://stackoverflow.com/a/2421441.
In your case, you want to group beers with the respect to aggregate rating and tasters, so you need to
GROUP BY a.beers
and order by the first and the third column.Thus the view should be like this:
1 and 3 are the ordinal positions of columns that appear in the select list.
The SQL standard defines that a
SELECT
statement without anORDER BY
may return the resulting rows in an arbitrary order. There is a good explanation of why it might look different at first.GROUP BY
is for grouping and aggregating related tuples together. A trivial implementation for grouping is of course sorting the data first and then working your way from top to bottom and aggregate related tuples. If your database chooses to use such an implementation chances are high you will receive an ordered result set. But there are other implementations possible and your database may change to using one of those at any time. And if it does you will receive the same rows but in a different order.So in short, if you want your result set to be ordered use an
ORDER BY
.Regarding the example you mentioned it is perhaps a bit misleading. The result is ordered by the columns it is also grouped by. This is a possible ordering and given the thoughts before probably quite a likely one. But since the query is without an
ORDER BY
this is just coincidence and not guaranteed.