In my latest project I have been migrating my database from SQLite to PostgreSQL. I have been updating my SQL code to work with PostgreSQL, but I am struggling with one difference between the two: grouping.

For example, if I wanted multiple columns in the SELECT clause but only wanted to group by one of them, that was very easy in SQLite. This query works as intended in SQLite.

item_id is a foreign key

SELECT item_id, avg_price, min_price
FROM App_price
GROUP BY item_id

However, executing the same query in PostgreSQL raises the error:

column "App_price.avg_price" must appear in the GROUP BY clause or be used in an aggregate function

The only field I want to group by is item_id, so I don't want to put avg_price and min_price in the GROUP BY clause. I also do not want to use any aggregate functions on these fields. What is the solution to this?

2 Answers


  1. Given that SQLite returns an arbitrary row per group, the equivalent behavior can be achieved in PostgreSQL with:

    SELECT DISTINCT ON (item_id) item_id, avg_price, min_price
      FROM App_price
     ORDER BY item_id;
    

    As @NickW pointed out, SQLite’s behavior is non-standard. It also isn’t logical: selecting a random row is not grouping.
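
    With ORDER BY item_id alone, which row DISTINCT ON keeps for each item_id is still unpredictable. If a specific row is wanted, ORDER BY can be extended to pin the choice down. A minimal sketch, assuming hypothetical sample rows and column types (the question gives neither), that keeps the row with the lowest min_price per item_id:

    ```sql
    -- Hypothetical schema and data; the column types are assumptions.
    CREATE TABLE App_price (
        id        integer PRIMARY KEY,
        item_id   integer NOT NULL,
        avg_price numeric,
        min_price numeric
    );

    INSERT INTO App_price (id, item_id, avg_price, min_price) VALUES
        (1, 10, 5.00, 4.00),
        (2, 10, 6.00, 3.50),
        (3, 20, 9.00, 8.00);

    -- PostgreSQL only: DISTINCT ON keeps the first row of each item_id group
    -- in ORDER BY order, so sorting by min_price picks the cheapest row and
    -- id breaks any ties.
    SELECT DISTINCT ON (item_id) item_id, avg_price, min_price
      FROM App_price
     ORDER BY item_id, min_price, id;
    ```

    With these sample rows, item_id 10 resolves to the row with min_price 3.50 and item_id 20 to its only row.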

  2. From the PostgreSQL documentation on GROUP BY:

    When GROUP BY is present, or any aggregate functions are present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.
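
    The functional-dependency exception can be illustrated with a short sketch. The item table and all column types here are hypothetical; the question only says that item_id is a foreign key:

    ```sql
    -- Hypothetical parent table that item_id is assumed to reference.
    CREATE TABLE item (
        id   integer PRIMARY KEY,
        name text NOT NULL
    );

    CREATE TABLE App_price (
        id        integer PRIMARY KEY,
        item_id   integer NOT NULL REFERENCES item (id),
        avg_price numeric,
        min_price numeric
    );

    -- Accepted by PostgreSQL: item.id is the primary key of item, so
    -- item.name is functionally dependent on it and may stay ungrouped.
    SELECT item.id, item.name, min(p.min_price) AS lowest_min_price
      FROM item
      JOIN App_price AS p ON p.item_id = item.id
     GROUP BY item.id;
    ```

    The exception does not help with the query in the question: item_id is a foreign key, not the primary key of App_price, so avg_price and min_price are not functionally dependent on it.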

    From the SQLite documentation on GROUP BY, section 2.5, "Bare columns in an aggregate query":

    If there is exactly one min() or max() aggregate in the query, then all bare columns in the result set take values from an input row which also contains the minimum or maximum. So in the query above, the value of the "b" column in the output will be the value of the "b" column in the input row that has the largest "c" value. There are limitations on this special behavior of min() and max():

    If the same minimum or maximum value occurs on two or more rows, then bare values might be selected from any of those rows. The choice is arbitrary. There is no way to predict from which row the bare values will be chosen. The choice might be different for different bare columns within the same query.

    If there are two or more min() or max() aggregates in the query, then bare column values will be taken from one of the rows on which one of the aggregates has their minimum or maximum value. The choice of which min() or max() aggregate determines the selection of bare column values is arbitrary. The choice might be different for different bare columns within the same query.

    This special processing for min() or max() aggregates only works for the built-in implementation of those aggregates. If an application overrides the built-in min() or max() aggregates with application-defined alternatives, then the values selected for bare columns will be taken from an arbitrary row.

    Most other SQL database engines disallow bare columns. If you include a bare column in a query, other database engines will usually raise an error. The ability to include bare columns in a query is an SQLite-specific extension.
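
    The min()/max() rule quoted above can be seen with a small SQLite-only sketch. The sample rows and column types are hypothetical:

    ```sql
    -- Hypothetical schema and data; the column types are assumptions.
    CREATE TABLE App_price (
        id        integer PRIMARY KEY,
        item_id   integer NOT NULL,
        avg_price numeric,
        min_price numeric
    );

    INSERT INTO App_price (id, item_id, avg_price, min_price) VALUES
        (1, 10, 5.00, 4.00),
        (2, 10, 6.00, 3.50),
        (3, 20, 9.00, 8.00);

    -- SQLite only: avg_price is a bare column. Because the query contains
    -- exactly one min() aggregate, avg_price is taken from the row that
    -- holds the minimum min_price within each item_id group.
    SELECT item_id, avg_price, min(min_price) AS min_price
      FROM App_price
     GROUP BY item_id;
    ```

    For item_id 10, avg_price comes from the row whose min_price is 3.50. PostgreSQL rejects this query with the error shown in the question.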

    So in PostgreSQL you would need something like:

    SELECT item_id, min(avg_price), min(min_price)
    FROM App_price
    GROUP BY item_id
    
    

    I know you said you didn’t want aggregate functions, but that is what you will have to do. Even if you resorted to a sub-query, avg_price and min_price would be essentially arbitrary values. Bottom line: for a given grouping field (item_id), any non-grouped values are just going to be picked out of a hat and are not really useful.
