skip to Main Content

I want to count certain values in all tables of a schema that contain a column that can contain those values.

Was hoping to use a LATERAL join to loop over all tables, but I’m running into issues:

select
    fully_qualified_table_name,
    cnt
from (
    select
        'datastore.' || table_name as fully_qualified_table_name
    from
        information_schema.columns
    where
        table_schema = 'datastore'
        and column_name = 'dss_current_flag'
    
    cross join lateral
    
    select
        count(*) as cnt
    from
        information_schema.fully_qualified_table_name
    );

Is this possible?

2

Answers


  1. Chosen as BEST ANSWER

    Based on the answer by @jim-jones my final solution was

    CREATE TYPE datastore.schema_table_column_counts_type AS (
        schema_name text,
        table_name text,
        column_name text,
        value text,
        count_p bigint);
    
    CREATE OR REPLACE FUNCTION datastore.count_records_in_schema_where_column_has_value(_schema_name text, _column_name text, _value text) 
    RETURNS setof datastore.schema_table_column_counts_type language plpgsql AS $$
    DECLARE  
      rec record;
      result_record datastore.schema_table_column_counts_type;
    BEGIN   
      FOR rec IN 
        SELECT 
            table_schema AS sch,
            table_name AS tb, 
            $2 as cn, 
            $3 as v
        FROM information_schema.columns
        WHERE table_schema = $1
        AND column_name = $2
      LOOP
        EXECUTE format($ex$ 
            SELECT 
                '%1$s' as schema_name, 
                '%2$s' as table_name, 
                '%3$s' as column_name,
                '%4$s' as value,
                count(*) 
            FROM 
                %1$s.%2$s
            WHERE
                %3$s = %4$L 
            $ex$
            , rec.sch, rec.tb, rec.cn, rec.v) 
        INTO result_record;
        return next result_record;
      END LOOP;
    END $$ ;
    
    SELECT * from datastore.count_records_in_schema_where_column_has_value('datastore', 'dss_current_flag', 'P');
    

  2. I’m afraid it is not possible to run dynamic queries using pure SQL. You might wanna check PL/pgSQL instead, e.g.

    CREATE OR REPLACE FUNCTION count_records() 
    RETURNS bigint AS $$
    DECLARE  
      rec record;
      res bigint = 0; ct bigint = 0;
    BEGIN
      FOR rec IN 
        SELECT table_schema AS sch,table_name AS tb
        FROM information_schema.columns
        WHERE table_schema = 'datastore' AND column_name = 'dss_current_flag'   
      LOOP
        EXECUTE format($ex$ SELECT count(*) FROM %I.%I $ex$,rec.sch,rec.tb) 
        INTO ct;
        res := res + ct;
      END LOOP;
      RETURN res;
    END $$ LANGUAGE 'plpgsql';
    

    The more flexible approach would be to provide schema and table names as parameters in the function call instead of hard coding it in the function body, such as CREATE FUNCTION count_records(_schema_name text, _table_name text) .., or even the fully qualified table name as a single parameter: CREATE FUNCTION count_records(_qualified_table_name text) ... .

    Demo: db<>fiddle

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search