skip to Main Content

I am using PyGreSQL to execute queries against my DB. I need to insert the values to be matched at runtime, but the parameters are not being into the query correctly. If my query is:

SELECT *
FROM Database
WHERE job_level = ANY(:job_level)
AND job_family = ANY(:job_family)
AND cost_center = ANY(:cost_center)

And I wish to use parameters

{'cost_center': '*',
 'job_family': 'SDE',
 'job_level': ['5', '6', '4', '7'],
}

How do I correctly insert these values via

cursor.execute(query, parameters)

I have tried:
Not using the ANY operator, but just using IN. This fails with

ERROR:  syntax error at or near ":"
LINE 4:                 WHERE job_level IN :job_level

Using %(job_level)s syntax instead of the :job_level. This fails with

ProgrammingError: ERROR:  syntax error at or near "ANY"
LINE 4:                 WHERE job_level IN ANY(ARRAY['5','6','4','7'...

So how do I properly combine the placeholders in the query with the parameters to match all three cases here, a single value, a list of values, and a wildcard matching an entire column?

2

Answers


  1. Chosen as BEST ANSWER

    Through much trial and error, I have found the correct combination of placeholders and parameter formats to match all my use cases: single values, wildcards for an entire column, and lists of individual values. This may not be the only solution, but it works.

    Placeholders in the query should be formatted using %()s syntax, inside the ANY operator:

    query = '''
    SELECT *
    FROM Database
    WHERE job_level = ANY(%(job_level)s)
    AND job_family = ANY(%(job_family)s)
    AND cost_center = ANY(%(cost_center)s)
    '''
    

    Single and multi-value parameters need to be formatted as lists (for single values, that's probably a side effect of using ANY). For wildcard matches, you can use the name of the column itself, just as with the IN operator, but this column name must be wrapped in the pgdb.Literal() function. Thus, to match my example selection criteria:

    {'cost_center': '*', # All cost centers (wildcard)
     'job_family': 'SDE', # Only this single job family
     'job_level': ['5', '6', '4', '7'], # Any of these multiple job levels
    }
    

    The complete parameter dictionary for my example should be formatted as:

    parameters = {
    'cost_center': [pgdb.Literal('cost_center')],
    'job_family': ['SDE'],
    'job_level': ['5','6','4','7']
    }
    

    Which can be safely executed with cursor.execute(query, parameters)

    This will allow matching of single values, multiple values, and wildcard matches.


  2. The following example code shows how to do this with PyGreSQL.

    Some remarks upfront:

    • It looks like your table is called database. You should rename it, because database is a keyword in SQL, and database objects are different from table objects, so it can be confusing to read such SQL. In the example I called the table jobs.
    • You are using * as wildcard, but SQL uses % instead. You can also use regular expressions in PostgreSQL, but then you need to write .* instead of *. They also have a different syntax. See Pattern Matching for the available options. In the example I used % as pattern with LIKE.
    • Of course, since * (%) matches any value, you can also simply remove that comparison from the WHERE clause in this case.

    Example using the DB API 2 of PyGreSQL

    import pgdb
    
    con = pgdb.connect(...)  # fill in connection parameters
    cur = con.cursor()
    
    query_string = '''
    SELECT * FROM jobs
    WHERE cost_center LIKE %(cost_center)s
    AND job_family = %(job_family)s
    AND job_level = ANY(%(job_level)s)
    '''
    
    parameters = dict(
        cost_center='%',
        job_family='SDE',
        job_level=['5', '6', '4', '7'])
    
    rows = cur.execute(query_string, parameters)
    
    for row in rows:
        print(row)
    
    cur.close()
    con.close()
    

    Example using the classic API of PyGreSQL

    import pg
    
    db = pg.DB(...)  # fill in connection parameters
    
    query_string = '''
    SELECT * FROM jobs
    WHERE cost_center LIKE %(cost_center)s
    AND job_family = %(job_family)s
    AND job_level = ANY(%(job_level)s)
    '''
    
    parameters = dict(
        cost_center='%',
        job_family='SDE',
        job_level=['5', '6', '4', '7']
    )
    
    query = db.query_formatted(query_string, parameters)
    rows = query.namedresult()
    
    for row in rows:
        print(row)
    
    db.close()
    

    Using a parameter tuple instead of a dict

    Both APIs also allow passing parameters as tuples instead of dicts. But dicts are more descriptive, allow reusing parameters in several places in the query, and you do not need to care about the order of the parameters. So I recommend using tuples only for very simple queries.

    When using tuples, the query_string and the parameters should loke like this:

    query_string = '''
    SELECT * FROM jobs
    WHERE cost_center LIKE %s
    AND job_family = %s
    AND job_level = ANY(%s)
    '''
    
    parameters = ('%', 'SDE', ['5', '6', '4', '7'])
    

    Using bind parameters directly

    The classic API also allows using unformatted (raw) queries with bind parameters $1, $2 etc. directly as they are written in PostgresQL.

    For simple data types this works just like passing tuples as above, but other data types like arrays need to be adapted manually. This can be done as follows:

    query_string = '''
    SELECT * FROM jobs
    WHERE cost_center LIKE $1
    AND job_family = $2
    AND job_level = ANY($3)
    '''
    
    adapt = pg.Adapter(db).adapt
    
    parameters = ('%', 'SDE', adapt(['5', '6', '4', '7']))
    
    query = db.query(query_string, parameters)
    rows = query.namedresult()
    

    However, the recommended usage is with formatted querys, where PyGreSQL automatically adapts input and typecasts output.

    In most cases, like when passing a list, PyGreSQL can guess the corresponding type in the database, like array in this case. For more exotic data types, you must specify the database type of the parameters explicitly in the adapt or query_formatted methods. Another way to adapt parameters is wrapping them as type objects, which is also supported in the DB API 2.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search