skip to Main Content

I have this query which selects the users for a given branch.

SELECT u.first, u.last, u.id, 
    (SELECT u1.user_type
    FROM users_branches AS u1
    WHERE u.id = u1.user_id AND u1.branch_id = @foo
    LIMIT 1)
FROM users AS u
WHERE EXISTS
    (SELECT 1
    FROM users_branches AS u0
    WHERE u.id = u0.user_id AND u0.branch_id = @foo)

And here is the EF Core LINQ query which produces this SQL.

int branchId = 123;
var users = await Users
    .Where(x => x.UsersBranches.Any(ub => ub.BranchId == branchId))
    .Select(user => new
    {
        UserId = user.Id,
        FirstName = user.First,
        LastName = user.Last,
        UserType = user.UsersBranches.FirstOrDefault(x => x.BranchId == branchId).UserType
    }).ToListAsync();

I don’t like the WHERE clause needing to be duplicated just to say "get me only the users which have a users_branches record for this branch, and also return some of the data from that record". Is there a more efficient way to do this? I can imagine a solution that uses GROUP BY, but I’m not certain that would actually be any better.

3

Answers


  1. I find this version easier to read:

    SELECT users.first, users.last, users.id, MIN(users_branches.user_type)
    FROM users
    INNER JOIN users_branches ON
        users.id = users_branches.user_id
        AND users_branches.branch_id = @foo
    GROUP BY users.id
    

    as to whether it delivers better performance, that will depend on a number of factors, like indices etc. I recommend running some tests and examining the output of EXPLAIN SELECT .... for the different queries you try

    Login or Signup to reply.
  2. one way you can do this by using joins

    SELECT u.id AS UserId, u.first AS FirstName, u.last AS LastName 
    FROM users u
    INNER JOIN users_branches ub ON u.id = ub.user_id
    WHERE ub.branch_id = 123
    

    another way you can do this by using subquery

    SELECT u.id AS UserId, u.first AS FirstName, u.last AS LastName
    FROM users u 
    WHERE u.id IN (
      SELECT user_id 
      FROM users_branches 
      WHERE branch_id = 123
    )
    

    if you see main difference is that the join version queries both tables in one statement, while the subquery version filters UsersBranches first in a subquery before querying Users.

    Note:The join is generally more efficient since it allows the database to access both tables at once and optimize the query plan. The subquery can result in slower performance since it splits the query into two parts.

    — hope this helps.

    Login or Signup to reply.
  3. Maybe can you try something like this

    SELECT u.first, u.last, u.id, u0.user_type
    FROM users u
    JOIN users_branches u0 ON u.id = u0.user_id AND u0.branch_id = @foo
    WHERE NOT EXISTS (
        SELECT 1
        FROM users_branches u1
        WHERE u1.user_id = u0.user_id
        AND u1.branch_id = u0.branch_id
        AND u1.id < u0.id
    )
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search