skip to Main Content

I am trying to combine stv_tbl_perm,pg_class and stl_query to get queries ran against tables with 0 rows using EXISTS.

Current Query

SELECT z.* 
FROM stl_query WHERE EXISTS (
    SELECT b.relname FROM (SELECT db_id, id, name, MAX(ROWS) rows_all_dist, SUM(ROWS) "rows"
                           FROM stv_tbl_perm
                           GROUP BY db_id, id, name) a
    INNER JOIN pg_class b ON b.oid = a.id
    WHERE CASE WHEN b.reldiststyle = 8 
               THEN a.rows_all_dist 
               ELSE a.rows 
          END = 0
    AND b.relname LIKE '%' || z.querytxt || '%')
                       

Error

XX000: This type of correlated sub query pattern is not supported due to internal error

Sub Query

Ran alone this brings back all tables which contain 0 rows

SELECT b.relname FROM (SELECT db_id, id, name, MAX(ROWS) rows_all_dist, SUM(ROWS) "rows"
                       FROM stv_tbl_perm
                       GROUP BY db_id, id, name) a
INNER JOIN pg_class b ON b.oid = a.id
WHERE CASE WHEN b.reldiststyle = 8 
           THEN a.rows_all_dist 
           ELSE a.rows 
      END = 0

What I am trying to do is combine this with stl_query to see if any of the tables with 0 rows are being queried.

2

Answers


  1. Chosen as BEST ANSWER

    Because there is no way to combine the 3 tables to use EXISTS to look for tables with 0 rows I used the below (Which is not optimised nor elegant) to get the results I needed.

    NOTE: I used 14 "levels" of SPLIT_PART as that was the most amount of tables used in a single query for my results, this can be increased/decreased depending on your use case)

    CODE

    SELECT relname
    FROM (SELECT db_id, id, name, MAX(ROWS) rows_all_dist, SUM(rows) "rows" 
          FROM stv_tbl_perm
          GROUP BY db_id, id, name) a
    INNER JOIN pg_class pgc ON pgc.oid = a.id
    WHERE CASE WHEN pgc.reldiststyle = 8 THEN a.rows_all_dist ELSE a.rows END = 0
    AND relname in (SELECT DISTINCT REPLACE(REPLACE(a, ')', ''), '()', '') 
                    FROM (SELECT split_part(split_part(querytxt, 'from ', 2), ' ', 1) a 
                          FROM stl_query
                          UNION ALL
                          SELECT split_part(split_part(querytxt, 'from ', 3), ' ', 1) a 
                          FROM stl_query
                          UNION ALL
                          SELECT split_part(split_part(querytxt, 'from ', 4), ' ', 1) a 
                          FROM stl_query
                          UNION ALL
                          SELECT split_part(split_part(querytxt, 'from ', 5), ' ', 1) a 
                          FROM stl_query
                          UNION ALL
                          SELECT split_part(split_part(querytxt, 'from ', 6), ' ', 1) a 
                          FROM stl_query
                          UNION ALL
                          SELECT split_part(split_part(querytxt, 'from ', 7), ' ', 1) a 
                          FROM stl_query
                          UNION ALL
                          SELECT split_part(split_part(querytxt, 'from ', 8), ' ', 1) a 
                          FROM stl_query
                          UNION ALL
                          SELECT split_part(split_part(querytxt, 'from ', 9), ' ', 1) a 
                          FROM stl_query
                          UNION ALL
                          SELECT split_part(split_part(querytxt, 'from ', 10), ' ', 1) a 
                          FROM stl_query
                          UNION ALL
                          SELECT split_part(split_part(querytxt, 'from ', 11), ' ', 1) a 
                          FROM stl_query
                          UNION ALL
                          SELECT split_part(split_part(querytxt, 'from ', 12), ' ', 1) a 
                          FROM stl_query
                          UNION ALL
                          SELECT split_part(split_part(querytxt, 'from ', 13), ' ', 1) a 
                          FROM stl_query
                          UNION ALL
                          SELECT split_part(split_part(querytxt, 'from ', 14), ' ', 1) a 
                          FROM stl_query)
                         )
    

  2. I’m assuming that the question you are implying is ‘why am I getting this error?’

    These correlated subquery errors are caused by referencing the outer query’s data inside the subquery – AND b.relname LIKE ‘%’ || z.querytxt || ‘%’ (assuming that z is stl_query as you didn’t define z in query provided). This query structure is very expensive on clustered databases like Redshift – for each row of stl_query re-evaluate the subquery to see if a row exists. Don’t write queries in this way for Redshift.

    There are plenty of examples of this issue being addressed on SO and you can reference those. (Here’s one I answered just last week – ERROR: This type of correlated subquery pattern is not supported yet). Basically you need to rewrite the EXISTS structure into an INNER JOIN.

    Also, you got what will become your join-on condition backwards. b.relname is the table name but z.querytxt is the first 4000 characters of the query so relname will be in querytxt (if it isn’t too long). So you need

    z.querytxt LIKE '%' || b.relname || '%')
    

    Lastly, this query is likely to be slow and unreliable due to only having the first 4000 chars of the query text. You will likely be better off starting with stl_scan as this will give the table id of all tables scanned by a query. It also gives rows processed which would be a good start as to zero row tables but would also included tables where no rows were scanned due to WHERE clauses. So you would still need to look at tbl_perm to confirm zero row tables.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search