skip to Main Content

I am working on a project where each user is assigned to its own schema, which looks like this:

user_537        (schema)
    posts       (table)
    likes       (table)
    actions     (table)
user_538        (schema)
    posts       (table)
    likes       (table)
    actions     (table)
user_539        (schema)
    posts       (table)
    likes       (table)
    actions     (table)

Now I want to query across every actions table. I would like to get the result of:

select 
    537 as user_id, 
    count(*) actions
from actions 
UNION ALL

select 
    538 as user_id, 
    count(*) actions
from actions 
UNION ALL

select 
    539 as user_id, 
    count(*) actions
from actions 
UNION ALL

How can I do that without manually typing all schemas? For now I only managed to create a select that I can run to get what I want:

SELECT distinct 'SELECT ' || REPLACE(table_schema, 'user', '' ) || ' as user_id, count(*) as actions FROM ' || table_schema || '.actions UNION ALL' AS query 
FROM information_schema.tables
where table_schema like '%user%'

But I dont like this for 2 reasons:

  • I have to manually copy output query to run it again
  • I have additional UNION ALL at the and which I have to manually remove

Instead of this I would like to have a query that iterate over each schema, run a code and unions the result.

3

Answers


  1. You should not have "many same schemas".

    (Either that, or you should not be working on that project.)

    You should have only one schema where each record in each table also contains the user id.

    That’s the way relational databases are done.

    If unsure about relational database fundamentals, then a good starting point would be Wikipedia (https://en.wikipedia.org/wiki/Relational_database)

    Look for subjects like "normalization" and "denormalization" and you will understand exactly why having "many same schemas" is terribly, terribly wrong, and why if you do this you paint yourself into a corner and you cannot proceed any further.

    It would not hurt to read a few good books on the subject, or take a course, before attempting to program relational databases.

    Login or Signup to reply.
  2. As pointed out, your database design should be revised (normalized), but as it is now you could generate your query like this:

    DO $$
    DECLARE
        l_schema_name TEXT;
        l_query TEXT := '';
        result record;
    BEGIN
        FOR l_schema_name IN SELECT schema_name FROM information_schema.schemata WHERE schema_name NOT LIKE 'pg_%' AND schema_name != 'information_schema' LOOP
            l_query := l_query  || 'select ''' || l_schema_name || ''' as user_id, count(*) actions from '|| l_schema_name ||'.actions union all ';
        END LOOP;
        l_query := regexp_replace(l_query,'sunion alls$',';','g'); 
        for result in execute l_query loop
            raise notice 'Result: %', result;
        end loop;
    END $$;
    

    Implemented as a function is:

    CREATE OR REPLACE FUNCTION my_function()
    RETURNS TABLE(user_id text, actions integer)
    AS $$
    DECLARE
        l_schema_name TEXT;
        l_query TEXT := '';
        result record;
    begin
        FOR l_schema_name IN SELECT schema_name FROM information_schema.schemata WHERE schema_name NOT LIKE 'pg_%' AND schema_name != 'information_schema' LOOP
            l_query := l_query  || 'select ''' || l_schema_name || ''' as user_id, count(*) actions from '|| l_schema_name ||'.actions union all ';
        END LOOP;
        l_query := regexp_replace(l_query,'sunion alls$',';','g');
        return query execute l_query;
    END;
    $$ LANGUAGE plpgsql;
    

    and then to execute

    select * from my_function();
    
    Login or Signup to reply.
  3. The following query gets the counts for actions, likes, and posts across schemas:

    WITH s AS (
      SELECT schemata.schema_name,
             regexp_substr(schemata.schema_name, 'd+$') AS user_id
        FROM information_schema.schemata
      WHERE schemata.schema_name LIKE 'user_%' ESCAPE ''
    )
    SELECT s.user_id,
           (XPATH('row/actions/text()', x.counts))[1]::text::int AS actions,
           (XPATH('row/likes/text()', x.counts))[1]::text::int AS likes,
           (XPATH('row/posts/text()', x.counts))[1]::text::int AS posts
      FROM s
      CROSS JOIN LATERAL (
        SELECT (XPATH('/table/row',
                      (QUERY_TO_XML(FORMAT($$SELECT (SELECT COUNT(*) FROM %I.actions) AS actions,
                                                    (SELECT COUNT(*) FROM %I.likes) AS likes,
                                                    (SELECT COUNT(*) FROM %I.posts) AS posts$$,
                                           s.schema_name,
                                           s.schema_name,
                                           s.schema_name
                                          ),
                                    FALSE,
                                    FALSE,
                                    ''
                                   )
                      )
                     )
               )[1] AS counts
      ) x
     ORDER BY s.user_id;
    

    If only the actions counts are required, then the query can be simplified to use a single XPATH:

    WITH s AS (
      SELECT schemata.schema_name,
             regexp_substr(schemata.schema_name, 'd+$') AS user_id
        FROM information_schema.schemata
      WHERE schemata.schema_name LIKE 'user_%' ESCAPE ''
    )
    SELECT s.user_id,
           t.actions
      FROM s
      CROSS JOIN LATERAL (
        SELECT (XPATH('/table/row/actions/text()',
                      QUERY_TO_XML(FORMAT($$SELECT COUNT(*) AS actions FROM %I.actions$$,
                                          s.schema_name),
                                   FALSE,
                                   FALSE,
                                   '')
                     )
               )[1]::text::int AS actions) t;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search