In my Postgres 13 DB I want to update a column mother_tongues_id
in my table based on name columns.
Names are like “Le Minh Thien” and I want to check if those contain words like ‘le’ or ‘lee’ then set nationality to ‘Vietnam’. The words check table is some thing like this:
language | noun |
---|---|
vietnam | +lee+le+long+la+ |
chinese | +lu+zhu+ |
Here is my UPDATE
code:
UPDATE resume
SET mother_tongues_id = (
SELECT id FROM mother_tongues
WHERE check_language(resume.id) = id
limit 1);
WHERE mother_tongues_id IS NULL;
Here is my check_language()
function:
CREATE OR REPLACE FUNCTION check_language(rid integer) RETURNS integer AS
$BODY$
DECLARE i int = (SELECT (LENGTH(CONCAT(first_name, ' ',last_name)) - LENGTH(replace(CONCAT(first_name, ' ',last_name), ' ', ''))) FROM resume where id = rid);
DECLARE sname varchar = '';
DECLARE sid int = 0
LOOP
SET i = i - 1;
SET sname = SELECT split_part(SELECT CONCAT(first_name, ' ',last_name) FROM resume where id = rid, ' ', i)
if select id from mother_tongues WHERE noun ILIKE '%+sname+%' then SET sid = id;
end if;
IF i = 1 THEN
EXIT;
END IF;
END LOOP;
if sid > 0 then RETURN sid;
else RETURN NULL;
end if;
END;
$BODY$
LANGUAGE plpgsql;
I try to create a function in Postgres with loop to split the name and check the words, however always error.
2
Answers
You have some syntax errors, here is a fix, have a try again.
Your function is extremely inefficient, errors aside.
You certainly don’t need a function for this to begin with. A plain
UPDATE
with a correlated subquery matching with the array overlap operator&&
does the job:Of course, this is just putting lipstick on a pig. You should really start by fixing your DB schema. As a minimum, convert
mother_tongues.noun
totext[]
, so we don’t have to do it on the fly. And add a GIN index on that column. See:Or properly normalize the design.
More fundamentally, deciding nationality based on typical names is a dubious approach. Besides being hardly valid, there can be any number of conflicts, like multiple matches. Currently,
LIMIT 1
simply takes an arbitrary match in this case.If you really need to loop in a PL/pgSQL function (which you don’t in this case), read here: