redshift_connector
is defined to be aligned with https://peps.python.org/pep-0249/#id24 but I can’t, after calling procedure, retrieve data into a dataframe.
Instead I’m getting 'mycursor'
value. How to overcome this?
fetch*()
methods don’t allow passing argument that will allow getting into data in mycursor
.
I also tried RECORDS type but no luck.
Procedure’s body:
--CREATE TABLE reporting.tbl(a int, b int);
--INSERT INTO reporting.tblVALUES(1, 4);
CREATE OR REPLACE PROCEDURE reporting.procedure(param IN integer, rs_out INOUT refcursor)
LANGUAGE plpgsql
AS $$
BEGIN
OPEN rs_out FOR SELECT a FROM reporting.tbl;
END;
$$;
Python code:
import redshift_connector
conn = redshift_connector.connect(
host='xyz.xyz.region.redshift.amazonaws.com',
database='db',
port=5439,
user="user",
password='p@@s'
)
cursor = conn.cursor()
cursor.execute("BEGIN;")
res = cursor.callproc("reporting.procedure", parameters=[1, 'mycursor'])
res = cursor.fetchall()
cursor.execute("COMMIT;")
#returns (['mycursor'],)
print(res)
2
Answers
I modified Bill's solution by:
Adding context manager class:
Below tested and I confirm that data is fetched without any temp tables.
I think you are trying to define 2 cursors and only one is allowed. "conn.cursor()" creates a cursor with name defined by redshift_connector. "OPEN rs_out FOR SELECT a FROM reporting.tbl;" in your procedure opens a second cursor with the name mycursor. The "cursor.fetchall()" is trying to fetch from the first cursor (and possibly erroring). No command is fetching from mycursor.
I don’t believe there is a way to get "cursor.fetchall()" to point to a different cursor name so I think you need to run the SQL commands (CALL, FETCH, etc) directly.
Something like this:
Be aware that if you are on a single node Redshift cluster FETCH ALL isn’t allowed and you will need to use FETCH FORWARD instead.
Above untested and off the cuff.