Postgresql - Can I count on a simple case/switch plpgsql function to use a sane query plan?

Him
December 30, 2023
152 views
0 votes
2 Answers

I am trying to implement a Postgres function with optional parameters. I tried a stable PL/pgSQL function with required parameters, splitting cases along user input in an IF construct.
Here is a simplified example:

Setup:

CREATE TABLE IF NOT EXISTS test_tab1(
    field1 int PRIMARY KEY,
    field2 int
);
CREATE TABLE IF NOT EXISTS test_tab2(
    field1 int PRIMARY KEY,
    field2 int
);
INSERT INTO test_tab1 SELECT FLOOR(RANDOM()*10000), FLOOR(RANDOM()*10000) FROM GENERATE_SERIES(1,10);
INSERT INTO test_tab2 SELECT * FROM test_tab1;
    
CREATE OR REPLACE FUNCTION test_tab1_func()
    RETURNS SETOF test_tab1
    LANGUAGE sql STABLE 
    AS $$
        SELECT * FROM test_tab1;
$$;

CREATE OR REPLACE FUNCTION test_tab2_func()
    RETURNS SETOF test_tab1
    LANGUAGE sql STABLE 
    AS $$
        SELECT * FROM test_tab2;
$$;

The function that is the subject of my question:

CREATE OR REPLACE FUNCTION test_func(foo int DEFAULT null)
  RETURNS SETOF test_tab1
  LANGUAGE plpgsql STABLE AS
$$
BEGIN
    IF (foo IS null) THEN
        RETURN QUERY
        SELECT * FROM test_tab1_func();
    ELSE
        RETURN QUERY
        SELECT * FROM test_tab2_func();
    END IF;
END;
$$;

EXPLAIN SELECT * FROM test_func();

EXPLAIN returns:

Function Scan on test_func  (cost=0.25..10.25 rows=1000 width=8)

My experiments with more complex queries show that this Function Scan estimate is not very "smart", and will return basically this same result no matter the complexity of the function body. Since the EXPLAIN output is so uninformative, I have come here for more info on the execution plan.

My question is this: Since the DB presumably can plan for test_tab1_func, and it can plan for test_tab2_func, can I count on the DB to use the plan that it would ordinarily devise for test_tab1_func for test_func when foo is null, and the plan that it would ordinarily devise for test_tab2_func for test_func otherwise? If not, is there a better way to implement this sort of "case/switch" function logic that Postgres can more readily plan around?

Answers

Your example updated to include a difference in outcomes for the functions:

CREATE TABLE IF NOT EXISTS test_tab1(
    field1 serial PRIMARY KEY,
    field2 int
);
CREATE TABLE IF NOT EXISTS test_tab2(
    field1 serial PRIMARY KEY,
    field2 int
);

INSERT INTO test_tab1(field2) SELECT  FLOOR(RANDOM()*100000000) FROM GENERATE_SERIES(1,1000000);
INSERT INTO test_tab2 SELECT * FROM test_tab1;

CREATE OR REPLACE FUNCTION test_tab1_func()
    RETURNS SETOF test_tab1
    LANGUAGE sql STABLE 
    AS $$
        SELECT * FROM test_tab1 where field1 = 11;
    $$;


CREATE OR REPLACE FUNCTION test_tab2_func()
    RETURNS SETOF test_tab1
    LANGUAGE sql STABLE 
    AS $$
        SELECT * FROM test_tab2 where field1 < 100000;
    $$;

CREATE OR REPLACE FUNCTION test_func(foo int DEFAULT null)
    RETURNS SETOF test_tab1
    LANGUAGE plpgsql STABLE 
    AS $$
        BEGIN
            IF (foo IS null) THEN
                RETURN QUERY
                SELECT * FROM test_tab1_func();
            ELSE
                RETURN QUERY
                SELECT * FROM test_tab2_func();
            END IF;
        END;
$$;

EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test_func();
                                                    QUERY PLAN                                                     
-------------------------------------------------------------------------------------------------------------------
 Function Scan on public.test_func  (cost=0.25..10.25 rows=1000 width=8) (actual time=0.375..0.376 rows=1 loops=1)
   Output: field1, field2
   Function Call: test_func(NULL::integer)
 Planning Time: 0.043 ms
 Execution Time: 0.957 ms

EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test_func(1);
                                                        QUERY PLAN                                                         
---------------------------------------------------------------------------------------------------------------------------
 Function Scan on public.test_func  (cost=0.25..10.25 rows=1000 width=8) (actual time=99.505..159.962 rows=999999 loops=1)
   Output: field1, field2
   Function Call: test_func(1)
 Planning Time: 0.016 ms
 Execution Time: 185.767 ms

Adding VERBOSE and ANALYZE shows that Postgres differentiates between the two functions and tracks what running each function on it’s own does:

EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test_tab1_func();
                                                           QUERY PLAN                                                            
---------------------------------------------------------------------------------------------------------------------------------
 Index Scan using test_tab1_pkey on public.test_tab1  (cost=0.42..8.44 rows=1 width=8) (actual time=0.006..0.007 rows=1 loops=1)
   Output: test_tab1.field1, test_tab1.field2
   Index Cond: (test_tab1.field1 = 11)
 Planning Time: 0.055 ms
 Execution Time: 0.014 ms
(5 rows)

EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test_tab2_func();
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Seq Scan on public.test_tab2  (cost=0.00..16925.00 rows=999999 width=8) (actual time=0.005..74.222 rows=999999 loops=1)
   Output: test_tab2.field1, test_tab2.field2
   Filter: (test_tab2.field1 < 1000000)
   Rows Removed by Filter: 1
 Planning Time: 0.070 ms
 Execution Time: 96.227 ms

- ErwinBrandstetter
- December 30, 2023 at 4:23 am
- 0 votes
0
Can I count on a simple case/switch plpgsql function to use a sane query plan?

Yes, you can always count on "sane" query plans – in plain SQL or nested in any function. It’s always the same query planner, just called from a different context. An "insane" query plan would be cause for a bug report to the PostgreSQL Bug List <[email protected]> – after making sure the insanity wasn’t all yours.

No, you cannot count on a sane query plan, since every nested SQL statement in a function gets its own plan – unless the function is inlined as a whole, which is only possible for SQL functions (LANGUAGE sql) that fulfill a couple of conditions. So not applicable to PL/pgSQL functions.

Your call results in one plan for the calling SELECT * FROM test_func(); and another, separate (nested) plan for SELECT * FROM test_tab1_func();.
SELECT * FROM test_tab2_func(); in the other leg of the IF construct is not planned at all until control actually reaches it.

To inspect every detail of what’s happening behind the curtains, use auto_explain. It’s simple, but you must be superuser. See:
- Get query plan for SQL statement nested in a PL/pgSQL function
Related:
- LANGUAGE SQL vs LANGUAGE plpgsql in PostgreSQL functions
Your example

Splitting into distinct queries along user input can make sense if that results in a more efficient query plan for at least one branch. But that’s not the case for your simple example, where the added overhead far outweighs the possible gain.

Your function test_func() and call can be replaced with plain SQL:
```
SELECT * FROM test_tab1_func() WHERE $foo IS NULL
UNION ALL
SELECT * FROM test_tab2_func() WHERE $foo IS NOT NULL;
```
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Postgresql – Can I count on a simple case/switch plpgsql function to use a sane query plan?

Answers

Your example