I want to count duplicate rows such that the function only gives the redundant count not the whole count, for example: when a row appears three times and the count function returns 3, the row is only redundant by 2 extra occurences; therefore, the function should return 2 and then sum up the redundant counts to get the total number of duplicate rows. Below is my code
CREATE OR REPLACE FUNCTION count_duplicates()
RETURNS SETOF INTEGER
AS $BODY$
BEGIN
WITH total AS (
SELECT brand, naira_price, city, state, phone_condition, color, COUNT(*) AS qty
FROM phones
GROUP BY brand, naira_price, city, state, phone_condition, color
HAVING COUNT(*) > 1
)
FOR i IN (a.qty) as dup
LOOP
RETURN QUERY SUM(i-1) FROM total a;
END LOOP;
END;
$BODY$ LANGUAGE plpgsql;
The function returns an error message instead ERROR: syntax error at or near "FOR"
LINE 12: FOR i IN (a.qty) as dup
Help is appreciated.
2
Answers
With the clue from @Laurenz Albe answer -- thanks, I have gotten the solution to my question:
This sums up the duplicate rows in the table and returns a single value of the summation.
You cannot mix SQL and PL/pgSQL in a single statement. But according to your description, you want this query: