skip to Main Content

I have two Postgres tables: ‘user’ with unique rows, and ‘address’ where a single user can have multiple addresses. I want to join specific ‘user’ row with one row from ‘address’ where the address.id is the highest max(id).

I wrote the following query which works OK but I suspect that when there are many addresses for one user, this query will produce a really large intermediate aggregate before it returns the the final results. Is there a better way to write this query?

select "user".id, 
       "user".username,     
       "user".trader_id, 
       address.id,                                             
       address.street, 
       address.trader_id 
    from "user", address 
  where "user".id = 6 
      and  address.trader_id = "user".trader_id                         
 order by address.id desc
 limit 1;

3

Answers


  1. Your query should work just fine and be performant, especially if you use appropriate indexes, such as on trader_id for both tables. (There are certain improvements that can be made, such as dropping the second trader_id column and using an alias for one or both id columns.)

    The query will fail, however, if you want to get the information for multiple users, because only a single row will be returned due to the LIMIT 1 clause. A more general solution then, where you can get the desired information for multiple "user".ids at a time would use a window function:

    SELECT u.id AS user_id, 
           u.username,     
           trader_id, 
           a.id AS address_id,                                             
           a.street
    FROM "user" u
    JOIN (
        SELECT id, street, trader_id,
               rank() OVER (PARTITION BY trader_id ORDER BY id) AS rank
        FROM address) a USING (trader_id)
    WHERE a.rank = 1
    ORDER BY u.id;
    

    In terms of performance, you should focus on indexes rather than query structure – the query planner will deal with the latter.

    Login or Signup to reply.
  2. You can skip using order by and limit in your query by using multicolumn subquery in order to get max id from address table of all/any user.

    step 1: Write a get max address id of all user from address table .

    select sa.trader_id,max(ss.id) as id from address as sa  group by 1
    

    Step 2: put the above query inside another query to fetch necessary column of address table.

            select 
                ss.* 
            from address as ss 
            where (ss.trader_id ,ss.id) in (select sa.trader_id,max(ss.id) as id from address as sa  group by 1)
    

    Step 3: Join the above query with "user" table. You can get any users latest address id from this using where condition.

    select "user".id, 
               "user".username,     
               "user".trader_id, 
               t1.id,                                             
               t1.street, 
               t1.trader_id 
            from "user" 
            join (
                    select 
                        ss.* 
                    from address as ss 
                    where (ss.trader_id ,ss.id) in (select sa.trader_id,max(ss.id) as id from address as sa  group by 1)
                )t1 on "user".trader_id =t1.trader_id
                  where "user".id = 6;
            
    
            
    
    Login or Signup to reply.
  3. Your query is pretty good already for getting a single user. Definitely much faster than running a window function over the whole address table in a subquery.
    Your query, only slightly simplified with a USING clause:

    SELECT trader_id, u.id AS user_id, u.username, a.id AS adr_id, a.street
    FROM   "user"  u
    JOIN   address a USING (trader_id)
    WHERE  u.id = 6
    ORDER  BY a.id DESC
    LIMIT  1;
    

    A multicolumn index on address(trader_id, id) will give you optimal performance. Plus, obviously, an index on "user"(id). (No "intermediate aggregate" like you apprehended with LIMIT 1!) See:

    Alternatively, the same technique in a LATERAL subquery. Works for retrieving one or more users:

    SELECT u.trader_id, u.id AS user_id, u.username, a.*
    FROM   "user"  u
    LEFT   JOIN LATERAL (
       SELECT a.id AS adr_id, a.street
       FROM   address a
       WHERE  a.trader_id = u.trader_id
       ORDER  BY a.id DESC
       LIMIT  1
       ) a ON true
    WHERE  u.id = 6;  -- or for more than just the one
    

    About LATERAL subqueries:

    Also using LEFT JOIN to preserve users without any address.

    If you are going to use a window function, use row_number() rather than rank(). Do it in a LATERAL subquery while only retrieving a single (or few) user(s), to only involve relevant rows. And, unless you run Postgres 16 or later, add the frame clause ROWS UNBOUNDED PRECEDING for performance:

    SELECT u.trader_id, u.id AS user_id, u.username, a.*
    FROM   "user"  u
    LEFT   JOIN LATERAL (
       SELECT id AS adr_id, street
            , row_number() OVER (ORDER BY id DESC ROWS UNBOUNDED PRECEDING) AS rn
       FROM   address a
       WHERE  a.trader_id = u.trader_id
       ) a ON a.rn = 1
    WHERE  u.id = 6;  -- or for more than one user
    

    Why ROWS UNBOUNDED PRECEDING? See:

    Or use DISTINCT ON:

    SELECT DISTINCT ON (traider_id)
           trader_id, u.id AS user_id, u.username, a.id AS adr_id, a.street
    FROM   "user"  u
    JOIN   address a USING (trader_id)  -- or LEFT JOIN?
    WHERE  u.id = 6
    ORDER  BY trader_id, a.id DESC;
    

    See:

    __

    Aside: Rather don’t use reserved words like "user" as identifier.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search