My Postgres database was migrated from MySQL using a tool and the code base has lot of syntax issues.
One of the issues is with the UPDATE statements inside procedures where the column name contains alias name as below.
UPDATE table1 t1 SET t1.col1 = 'some_value';
Having alias name after SET
keyword as in t1.col1
is a wrong syntax in Postgres.
As the number of procedures is huge, I’m trying to write a regular expression to find which procedures have this pattern.
select proname, prosrc from pg_proc
where regexp_replace(prosrc, E'[\n\r]+', ' ', 'g' ) ~* '[:UPDATE:]([[:space:]]+)[:set:]([[:space:]]+)^[a-z]([a-z]|[0-9])+.^[a-z]([a-z]|[0-9])+([[:space:]]*)[:=:]';
The regexp_replace
part on the left side of the condition is to remove line breaks which works fine. The main part on the right side is not returning the desired result.
I am trying to find the procedures that has UPDATE
keyword followed by one or more space, followed by SET keyword, followed by one more space, followed by one more alphanumeric characters (which starts with an alphabet), followed by a dot(.) , followed by one more alphanumeric characters (which starts with an alphabet), followed by zero or more spaces, followed by an equal to sign (=).
But the statement I formed seems to be wrong. Any help on this is much appreciated.
5
Answers
The below query gives me the expected results. It checks for code where we have
SET
followed by one or more space, followed by one or more alphanumeric character and_
, followed by a dot(.
), followed by one or more alphanumeric character and_
, followed by one or more spaces and followed by=
. This fetches all the procedures that have the issue that I posted in question.Yes, in PostgreSQL this is not working:
But, this is working correctly:
So we only need to clear the update field alias.
Example for do it:
For finding, selecting:
I think this may be more complex than you think… A procedure/function may have more than one update statement, and a simple regex will likely come up with many false positives.
I think you want a function to do a better job of eliminating false positives that result from:
FROM
orWHERE
clause, which are valid and not syntax errorsupdate
keywordHere is a boilerplate for what I think will get you close and minimize false positives:
So to get the results simply:
I did a test run, and your function did show up in the results.
Some small changes: