skip to Main Content

I have two models User and Subscription as follows

class User < ApplicationRecord
  has_many :subscriptions
end

class Subscription < ApplicationRecord
  belongs_to :user
  
  # attributes
  # name string
  # premium boolean
  # ...
end

Users will be having many subscription. How Can I find all the users who does not have a premium subscription (premium =true)

User.where('id not in (select user_id from subscriptions where premium = true)'). This works for me with subquery. Is there a better way of doing the same with joins?

2

Answers


  1. As @The russian shame said your query perfomance is good enough

    I just add another query with EXISTS to compare

    You can use where_exists gem for this purpose as

    User.where_not_exists(:subscriptions, premium: true)
    
    EXPLAIN ANALYZE SELECT * FROM users WHERE (NOT (EXISTS (SELECT 1 FROM subscriptions WHERE (subscriptions.user_id = users.id) AND premium = TRUE)));
                                                            QUERY PLAN                                                        
    --------------------------------------------------------------------------------------------------------------------------
     Hash Anti Join  (cost=123.50..153.00 rows=9 width=24) (actual time=2.652..3.357 rows=91 loops=1)
       Hash Cond: (users.id = subscriptions.user_id)
       ->  Seq Scan on users  (cost=0.00..17.00 rows=1000 width=24) (actual time=0.020..0.264 rows=1000 loops=1)
       ->  Hash  (cost=92.00..92.00 rows=2520 width=8) (actual time=2.617..2.618 rows=2520 loops=1)
             Buckets: 4096  Batches: 1  Memory Usage: 131kB
             ->  Seq Scan on subscriptions  (cost=0.00..92.00 rows=2520 width=8) (actual time=0.013..1.653 rows=2520 loops=1)
                   Filter: premium
                   Rows Removed by Filter: 2480
     Planning Time: 0.504 ms
     Execution Time: 3.409 ms
    (10 rows)
    
    EXPLAIN ANALYZE SELECT * FROM users WHERE users.id NOT IN (SELECT user_id FROM subscriptions WHERE premium = TRUE);
                                                          QUERY PLAN                                                      
    ----------------------------------------------------------------------------------------------------------------------
     Seq Scan on users  (cost=98.30..117.80 rows=500 width=24) (actual time=0.722..0.878 rows=91 loops=1)
       Filter: (NOT (hashed SubPlan 1))
       Rows Removed by Filter: 909
       SubPlan 1
         ->  Seq Scan on subscriptions  (cost=0.00..92.00 rows=2520 width=8) (actual time=0.004..0.415 rows=2520 loops=1)
               Filter: premium
               Rows Removed by Filter: 2480
     Planning Time: 0.055 ms
     Execution Time: 0.899 ms
    (9 rows)
    
    Login or Signup to reply.
  2. You’re better using the query you currently have.

    Take a look at two other different ways to achieve what you need, their execution time is slightly different (there are many other ways to get this done, but I got no time);

    EXPLAIN ANALYZE SELECT * FROM users WHERE users.id NOT IN (SELECT user_id FROM subscriptions WHERE premium = TRUE);
                                                        QUERY PLAN
    ------------------------------------------------------------------------------------------------------------------
     Seq Scan on users  (cost=19.79..41.91 rows=485 width=56) (actual time=0.025..0.032 rows=2 loops=1)
       Filter: (NOT (hashed SubPlan 1))
       Rows Removed by Filter: 2
       SubPlan 1
         ->  Seq Scan on subscriptions  (cost=0.00..18.70 rows=435 width=8) (actual time=0.004..0.008 rows=2 loops=1)
               Filter: premium
               Rows Removed by Filter: 2
     Planning Time: 0.051 ms
     Execution Time: 0.053 ms
    (9 rows)
    
    EXPLAIN ANALYZE SELECT * FROM users LEFT JOIN subscriptions ON subscriptions.premium = TRUE AND subscriptions.user_id = users.id WHERE subscriptions.id IS NULL;
                                                       QUERY PLAN
    -----------------------------------------------------------------------------------------------------------------
     Hash Right Join  (cost=31.83..51.67 rows=5 width=56) (actual time=0.045..0.055 rows=2 loops=1)
       Hash Cond: (subscriptions.user_id = users.id)
       Filter: (subscriptions.id IS NULL)
       Rows Removed by Filter: 2
       ->  Seq Scan on subscriptions  (cost=0.00..18.70 rows=435 width=16) (actual time=0.004..0.008 rows=2 loops=1)
             Filter: premium
             Rows Removed by Filter: 2
       ->  Hash  (cost=19.70..19.70 rows=970 width=56) (actual time=0.021..0.024 rows=4 loops=1)
             Buckets: 1024  Batches: 1  Memory Usage: 9kB
             ->  Seq Scan on users  (cost=0.00..19.70 rows=970 width=56) (actual time=0.007..0.013 rows=4 loops=1)
     Planning Time: 0.081 ms
     Execution Time: 0.077 ms
    (12 rows)
    
    EXPLAIN ANALYZE WITH f AS (SELECT user_id FROM subscriptions WHERE premium = TRUE) SELECT * FROM users WHERE id NOT IN (SELECT user_id FROM f);
                                                        QUERY PLAN
    ------------------------------------------------------------------------------------------------------------------
     Seq Scan on users  (cost=19.79..41.91 rows=485 width=56) (actual time=0.023..0.029 rows=2 loops=1)
       Filter: (NOT (hashed SubPlan 1))
       Rows Removed by Filter: 2
       SubPlan 1
         ->  Seq Scan on subscriptions  (cost=0.00..18.70 rows=435 width=8) (actual time=0.004..0.008 rows=2 loops=1)
               Filter: premium
               Rows Removed by Filter: 2
     Planning Time: 0.062 ms
     Execution Time: 0.050 ms
    (9 rows)
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search