skip to Main Content

I have a postgresql table contains a list of email addresses. The table has three columns, Email, EmailServer (e.g., gmail.com, outlook.com, msn.com, and yahoo.com.ca etc.), and Valid (boolean).

Now, I want to group those emails by EmailServer and then update the first 3 records of each large group (count >=6) as Valid = true while leaving the rest of each group as Valid = false.

I failed to get the wanted output by below query:

UPDATE public."EmailContacts"   
SET "Valid"=true
WHERE "EmailServer" IN (
    SELECT "EmailServer"
    FROM public."EmailContacts"
    GROUP by "EmailServer"
    HAVING count(*) >=6
    LIMIT 5)

Please help to modify so as to get the expected results. Would be greatly appreciated for any kind of your help!

2

Answers


  1. You need to get the servers, then order the mails from which one and then perform the update. Something like this:

    WITH DataSourceServers AS 
    (
        SELECT "EmailServer"
        FROM public."EmailContacts"
        GROUP by "EmailServer"
        HAVING count(*) >=6
    ),DataSourceEmails AS
    (
        SELECT "Email", row_number() OVER (PARTITION BY "EmailServer" ORDER BY "Email") AS rn
        FROM public."EmailContacts"
        WHERE "EmailServer" IN (SELECT "EmailServer" FROM DataSourceServers)
    )
    UPDATE public."EmailContacts"
    SET "Valid" = true
    FROM  public."EmailContacts" E
    INNER JOIN DataSourceEmails SE
    WHERE E."EmailServer" = SE."EmailServer"
        AND E."Email" = SE."Email"
        AND SE.rn <= 3;
    
    Login or Signup to reply.
  2. WITH major_servers AS (
        SELECT email_server
        FROM email_address
        GROUP by email_server
        HAVING count(*) >=6
    ),
    enumerated_emails AS (
        SELECT email,
               email_server,
               row_number() OVER (PARTITION BY email_server ORDER BY email) AS row_number --TODO:: ORDER BY email - attention
        FROM email_address
        WHERE email_server IN (SELECT email_server FROM major_servers)
    )
    UPDATE email_address
    SET valid = true
    WHERE email IN (SELECT email
                    FROM enumerated_emails ee
                    WHERE ee.row_number <= 3);
    
    • The first query major_servers finds major groups where more than 5 email servers exist.
    • The second query enumerated_emails enumerates emails by their natural order (see a TODO comment, I think you should choose another ORDER BY criteria) which belong to major groups using window function row_number().
    • The last query updates the first 3 rows in each major server group.

    Find the sql-fiddle here.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search