skip to Main Content

I want to count duplicate rows such that the function only gives the redundant count not the whole count, for example: when a row appears three times and the count function returns 3, the row is only redundant by 2 extra occurences; therefore, the function should return 2 and then sum up the redundant counts to get the total number of duplicate rows. Below is my code

 CREATE OR REPLACE FUNCTION count_duplicates()
 RETURNS SETOF INTEGER
 AS $BODY$
 BEGIN
     WITH total AS (
         SELECT brand, naira_price, city, state, phone_condition, color, COUNT(*) AS qty
         FROM phones
         GROUP BY brand, naira_price, city, state, phone_condition, color
         HAVING COUNT(*) > 1
     )

     FOR i IN (a.qty) as dup
     LOOP
         RETURN QUERY SUM(i-1) FROM total a;
     END LOOP;
 END;
 $BODY$ LANGUAGE plpgsql;

The function returns an error message instead ERROR: syntax error at or near "FOR"
LINE 12: FOR i IN (a.qty) as dup Help is appreciated.

2

Answers


  1. Chosen as BEST ANSWER

    With the clue from @Laurenz Albe answer -- thanks, I have gotten the solution to my question:

    WITH total AS (
        SELECT brand, naira_price, city, state, phone_condition, color,
            (COUNT(*) - 1) AS qty
        FROM phones
        GROUP BY brand, naira_price, city, state, phone_condition, color
        HAVING COUNT(*) > 1
    )
    
    SELECT SUM(a.qty) AS duplicate_rows FROM total a;
    

    This sums up the duplicate rows in the table and returns a single value of the summation.


  2. You cannot mix SQL and PL/pgSQL in a single statement. But according to your description, you want this query:

    SELECT brand, naira_price, city, state, phone_condition, color,
           COUNT(*) - 1 AS superfluous
    FROM phones
    GROUP BY brand, naira_price, city, state, phone_condition, color
    HAVING COUNT(*) > 1;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search