skip to Main Content
employe_id skill_level skill_id
1550 BEGINNER 560
6540 BEGINNER 560
2354 INTERMEDIATE 560
6654 ADVANCED 560
1550 ADVANCED 780
6540 BEGINNER 780
1550 INTERMEDIATE 780
2354 INTERMEDIATE 780
1550 INTERMEDIATE 450
6540 BEGINNER 654
8888 BEGINNER 560
6654 ADVANCED 455
1550 ADVANCED 110
6540 ADVANCED 885
2354 ADVANCED 980
6654 INTERMEDIATE 870

I want to only get employees with specific skills and their specific respective levels; I would get something like this:

employe_id skill_level skill_id
1550 BEGINNER 560
1550 INTERMEDIATE 780

I tried this but obviously it’s not what I want because it has an inclusive OR, so I don’t know which operator / technique should I use

select * 
from employees_skills mec
where (mec.skill_id, mec.skill_level) = (560, 'BEGINNER') 
or (mec.skill_id, mec.skill_level) = (780, 'INTERMEDIATE')

If I do this for a set of two skills (and their resp. levels), I’ll be able to do it for more.

EDIT: the employee 2354 should not be returned (even though they are beginner at 560, but they don’t have the other skill 780 and/or they’re not intermediate at it).

I expect employees who have ALL the skills in the WHERE conditions and their resp. levels

2

Answers


  1. You need some extra parens ():

    SELECT * 
    FROM employees_skills mec
    WHERE ((mec.skill_id, mec.skill_level) = (560, 'BEGINNER'))
    OR ((mec.skill_id, mec.skill_level) = (423, 'INTERMEDIATE'));
    

    This creates a tuple and that’s what you’re looking for.

    Login or Signup to reply.
  2. The first issue is you are looking for the combination skill combination (423, ‘INTERMEDIATE’), however that combination does not exist in your data.
    Your scenario presents an interesting dilemma: Your results must have multiple skill_id, skill_level combinations, but a single row can only have a single value set of skill_id, skill_level. So how to compare multiple rows to multiple values at the same time. Well you could use a series of and exists (select ...). However, this requires knowing the number of combinations or dynamic sql to construct the exist clauses. Another, is to use array comparison, specifically the array containment operator

    anyarray @> anyarray → boolean
    Does the first array contain the second, that is, does each element appearing in the second array equal some element of the first
    array?

    First build UDT that containers both skill_id and skill_level. Then create a table (could be a temporary table) to contain an array of the UDT.

    -- Create a UDT to contain both skill_id and skill_level
    create type skill_level_t as (sk_id integer, sk_level text); 
        
    -- create a work table to contain Arrays of target skill_id, skill_level combinations
     create table target_skills(tgt_skill_id integer generated always as identity
                                             primary key 
                               , skills      skill_level_t[]
                               ); 
    

    With the above in place you now aggregate the employee skills set by employee. Then JOIN that aggregate to the target skills with the contains operator (@>). (see demo)

    -- target query
    select emp_id           "Employee Id" 
         , (slist).sk_level "Skill Level"
         , (slist).sk_id    "Skill Id" 
      from ( select emp_id, unnest(skills) slist 
               from ( select es.emp_id,ts.skills
                       from ( select emp_id, array_agg( (skill_id, skill_level )::skill_level_t) skills 
                                from emp_skills
                              group by emp_id
                            )  es
                       join target_skills ts 
                         on  es.skills @> ts.skills
                     ) sq1
            ) sq2;
    

    NOTE: The final query can probably be refactored and reduced, but I wanted to show each step in its development.


    EDIT on question: Isn’t there a simpler / faster way?

    As I said it can probably be reduced. But look carefully you will see that the section ( select es ... ) s1 is the only part actually accessing the database. The others are just simple transformations of the results from previous step that is already in memory thus requiring no actual additional i/o. As mentioned you could use a set of and exists (select ... but number of selects needed is the same as the number (mec.skill_id, mec.skill_level) criteria sets (5 criteria then 5 selects, 10 criteria then 10 selects, 50 criteria well you get the idea). Each accessing the database with likely additional i/o. The above does not change regardless of the number of criteria (5, 10, 50 the query remains the same). Also combining simple transformations can into a single large transform can get complicated very quickly, just stringing the together seldom works. Obviously my opinion but isn’t there a simpler/faster way? No. One option may be to hide the query within a SQL function.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search