skip to Main Content

Using postgres 11 I have a data table int_state that looks like this:

dev int
sw01 gi1
sw01 gi2
sw01 gi3
sw01 gi4
sw02 gi3
sw02 gi4
sw02 gi5
sw03 gi3
sw03 gi4
sw03 gi6
sw03 gi7

I need a query that will return a single row for every instance of dev where int is the same across every instance of dev. Using the sample above query should only return "gi3" as the value for int since that value is shared between all instances of dev and result is limited is a single row for every dev. Desired results from sample data above:

dev int
sw01 gi3
sw02 gi3
sw03 gi3

The number of dev instances (x) to match against will vary, but results should always contain x rows, with a row for every dev asked for in the query. If there are multiple matches against int the query should only return the first one found.

Using a JOIN doesn’t seem like it will work for me, since I need x rows of results, not a single row with all results combined. I have been playing around with CTEs and UNION but so far unable to get it working. I thought nesting CTEs would help, but apparently that’s not supported in postgres.

I plan on constructing this query on-demand in code ahead of time, so if I have to construct something that contains where dev in ('sw01','sw02','x','y','z') that’s okay.

3

Answers


  1. WITH distinctdevs AS (SELECT DISTINCT dev FROM int_state) 
    SELECT distinctdevs.dev, (SELECT int FROM int_state ints WHERE NOT EXISTS 
     (SELECT dev FROM distinctdevs WHERE NOT EXISTS (
       SELECT * FROM int_state WHERE 
         int_state.int = ints.int AND int_state.dev = distinctdevs.dev
       )
      ) LIMIT 1) FROM distinctdevs;
    

    You might actually get better performance if you don’t reuse the CTE on line 3, as that might trick the optimizer into thinking this is a correlated subquery rather than a constant. The core trick here is the doubly nested WHERE NOT EXISTS, converting "an int that appears for every dev" to the logically equivalent "an int for which there is no dev for which it does not appear."

    Login or Signup to reply.
  2. ok, so if I understand your question, you want to find all the instances of dev (01, 02, 03, etc), and then check that the value for int is the same for all the instances?

    You can do a clever workaround I think….

    ---- get the count of how many instances of dev there are...
        WITH distinct_dev AS (  
          SELECT  
            dev  
            , count(*) AS dev_count  
          FROM
            table
          GROUP BY
            1
        )
    ----- get the count of how many instances of int there are per dev/int
        , dev_int AS (
          SELECT 
             t.dev
             , t.int
             , COUNT(DISTINCT t2.int) as count_dev_int
          FROM
            original_table t
            LEFT JOIN
            original_table t2
            ON t.dev = t2.dev
            AND t.int = t2.int
         GROUP BY
           1, 2
        )
    ----- then from the first CTE select unique devs, and join the 2nd CTE where the dev envs = the count of int per dev env....
    SELECT
       dd.dev
       , di.int
    FROM
       distinct_dev dd
       JOIN
       dev_int di
       ON di.dev = dd.dev
       AND di.count_dev_int = dd.count_dev
    
    Login or Signup to reply.
  3. You can build CTE with an array of distinct dev values and another with array of dev values for each int using array_agg. Then JOIN then on matching arrays. (see demo)

    with dev_array as 
         ( select array_agg(distinct dev) dev 
             from data_tab
         ) 
       , int_match as 
         (select int, array_agg(dev) dev
            from data_tab
           group by int
         )
    select int, unnest(dev) dev
      from (select im.*
              from dev_array da
              join int_match im  
                on (im.dev = da.dev)
             order by im.int
             limit 1
           ) sq;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search