skip to Main Content

I have one table as below which stores employees details

empid attribute value start_date end_date
E1 Active FALSE 2020-01-01 2022-05-05
E1 Active TRUE 2022-06-06 9999-12-31
E1 Branch NYC 2022-01-01 9999-12-31
E2 Active TRUE 2020-01-01 9999-12-31

Then another table which have address details

empid city
E1 CON

What I want is

  1. List of all the employee ids which do not have entries in address table

AND

  1. All those employees whose current value(today’s value) of attribute ACTIVE is TRUE.

I have written below query but its taking too much time to execute (3-4mins).
Is there any way i can optimize this query.

select distinct(emp.empid) from schema1.employees emp, schema2.address add 
where emp.empid = add.empid 
  and (emp.attribute ='ACTIVE' and emp.val in ('TRUE') 
  and CURRENT_DATE BETWEEN emp.start_date and emp.end_date)
  and emp.emp_id not in (select empid from schema2.address add2)
limit 20

2

Answers


  1. I made only minor modifications to your query that could help performance.

    select emp.empid
    from schema1.employees emp
    left join schema2.address add on emp.empid = add.empid 
    where emp.attribute ='ACTIVE' 
      and emp.val = 'TRUE'
      and emp.start_date <= CURRENT_DATE 
      and emp.end_date >= CURRENT_DATE 
      and add.empid is null;
    

    The following indexes that can be useful as well. If you don’t already have them, add them and try again.

    create index ix1 on schema1.employees (attribute, val, start_date, end_date);
    
    create index ix2 on schema2.address (empid);
    
    Login or Signup to reply.
  2. From the description it’s unclear how your two conditions should be combined.

    To get all employees that pass either of your two conditions:

    SELECT e.empid
    FROM   schema1.employees e
    WHERE  e.attribute = 'ACTIVE'
    AND    e.val  -- = true
    AND    CURRENT_DATE BETWEEN e.start_date AND e.end_date
    OR     NOT EXISTS (SELECT FROM schema2.address a WHERE a.empid = e.empid);
    

    To get all employees that pass both of your two conditions:

    ...
    AND    NOT EXISTS (SELECT FROM schema2.address a WHERE a.empid = e.empid);
    

    Assuming you only added DISTINCT (incorrectly with parentheses) to deal with duplicates introduced by the join. Not duplicating rows with an EXISTS expression in the first place, we also don’t need DISTINCT.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search