A rather simple question but for which we surprisingly didn’t found a solution.
Here is my current code, for executing a simple SQL query on a PostgreSQL database from Python 3.6.9 using psycopg2
('2.9.1 (dt dec pq3 ext lo64)'
):
import psycopg2
myid = 100
fields = ('p.id', 'p.name', 'p.type', 'p.price', 'p.warehouse', 'p.location', )
sql_query = ("SELECT " + ', '.join(fields) + " FROM product p "
"INNER JOIN owner o ON p.id = o.product_id "
"WHERE p.id = {} AND (o.dateof_purchase IS NOT NULL "
"OR o.state = 'checked_out' );"
).format(myid)
try:
with psycopg2.connect(**DB_PARAMS) as conn:
with conn.cursor(cursor_factory=DictCursor) as curs:
curs.execute(sql_query, )
row = curs.fetchone()
except psycopg2.Error as error:
raise ValueError(f"ERR: something went wrong with the query :n{sql_query}") from None
We’re more and more thinking that this is… not good. (awfully bad to be honest).
Therefore, we’re trying to use a modern f-string notation:
sql_query = (f"""SELECT {fields} FROM product p
INNER JOIN owner o ON p.id = o.product_id
WHERE p.id = {myid} AND (o.dateof_purchase IS NOT NULL
OR o.state = 'checked_out' );""")
But then, the query looks like:
SELECT ('p.id', 'p.name', 'p.type', 'p.price', 'p.warehouse', 'p.location', ) FROM ...;
which is not valid in PSQL because 1. of the brackets, and 2. of the single quoted column names.
We’d like to figure out a way to get rid of these.
In between, we went back to the doc and remembered this:
https://www.psycopg.org/docs/usage.html
Ooops! So we refactored it this way:
sql_query = (f"""SELECT %s FROM product p
INNER JOIN owner o ON p.id = o.product_id
WHERE p.id = %s AND (o.dateof_purchase IS NOT NULL
OR o.state = 'checked_out' );""")
try:
with psycopg2.connect(**DB_PARAMS) as conn:
with conn.cursor(cursor_factory=DictCursor) as curs:
# passing a tuple as it only accept one more argument after the query!
curs.execute(sql_query, (fields, myid))
row = curs.fetchone()
and mogrify()
says:
"SELECT ('p.id', 'p.name', 'p.type', 'p.price', 'p.warehouse', 'p.location', ) FROM ...;"
here again, the brackets and the single quotes are causing troubles, but no error is actually raised.
The only thing is that row
evaluates to this strange result:
['('p.id', 'p.name', 'p.type', 'p.price', 'p.warehouse', 'p.location', )']
So, how could we cleverly and dynamically build a psycopg2 query using a list of parameters for column names without neglecting the security?
(A trick could be to fetch all columns and filter them out after… but there are too many columns, some with quiet large amount of data that we don’t need, that’s why we want to run a query using a precisely defined selection of columns, which may get dynamically extended by some function, otherwise we would have hard-coded these column names of course).
OS: Ubuntu 18.04
PostgreSQL: 13.3 (Debian 13.3-1.pgdg100+1)
2
Answers
I finally found a solution. It makes use of
map
to use a list or a tuple of column names andsql.Literal
to use a given id, this is maybe cleaner:But because of that:
id
is equivalent to"id"
,name
is equivalent to"name"
for PostgreSQL,p.id
orproduct.id
instead of justid
or"id"
will miserably fail with the following error:The ‘%s’ insertion will try to turn every argument into an SQL string, as @AdamKG pointed out. Instead, you can use the psycopg2.sql module will allow you to insert identifiers into queries, not just strings: