
Is there a good strategy for normalizing user-entered Gmail addresses of the form

[email protected]
[email protected]

into the actual address? That is:

[email protected]

The use case is to prevent the creation of multiple website accounts whose Gmail addresses are distinct yet all point to the same Gmail inbox. The "normalized" email would be stored in a separate field in the database, so that when a new user signs up we can easily compare the new user's normalized address against the normalized addresses of existing users.

Here’s what I came up with and a code example:

  1. Delete all dots . that occur before @
  2. Delete all plus + and everything following up to @
  3. Delete the oogle out of @googlemail.com

These three match operations are OR'ed together in this regex:

/\.+(?=.*@(gmail|googlemail)\.com)|\+.*(?=@(gmail|googlemail)\.com)|(?<=@g)oogle(?=mail\.com)/gi

It works on the test cases below, but it’s not very polished. Is there another technique that is more effective?

const teststr = `[email protected]
[email protected]
[email protected]
[email protected]`;

const tests = teststr.split("\n");

const re = /\.+(?=.*@(gmail|googlemail)\.com)|\+.*(?=@(gmail|googlemail)\.com)|(?<=@g)oogle(?=mail\.com)/gi;

const results = tests.map(t => t.replace(re, ""));
console.log(results);
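
The three steps listed above can also be applied one after another, which makes each rule easier to inspect on its own. The following is only a sketch under some assumptions: the name normalizeStepwise and the sample addresses are hypothetical (they are not the test cases from the question), and each pattern is additionally anchored to the end of the string so that only a trailing gmail.com or googlemail.com domain qualifies.

```javascript
// Shared tail for the lookaheads: a Gmail domain at the very end of the string.
const GMAIL_DOMAIN = "(?:gmail|googlemail)\\.com$";

const steps = [
  // 1. Dots that occur before the @ of a Gmail address.
  new RegExp("\\.(?=[^@]*@" + GMAIL_DOMAIN + ")", "gi"),
  // 2. A plus sign and everything following it, up to the @.
  new RegExp("\\+[^@]*(?=@" + GMAIL_DOMAIN + ")", "i"),
  // 3. The "oogle" in @googlemail.com.
  /(?<=@g)oogle(?=mail\.com$)/i,
];

// Hypothetical helper: apply each rule in order, deleting whatever it matches.
function normalizeStepwise(address) {
  return steps.reduce((current, re) => current.replace(re, ""), address);
}

// Hypothetical sample addresses, chosen for illustration only.
const samples = [
  "john.doe@gmail.com",
  "johndoe+shopping@gmail.com",
  "j.o.h.n.doe+a.b@googlemail.com",
  "jane.doe+tag@example.com", // not Gmail, so it should come back unchanged
];

for (const sample of samples) {
  console.log(sample, "->", normalizeStepwise(sample));
}
```

Keeping the rules separate trades one dense pattern for three short ones, so a failing test case points at a single rule.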

2 Answers


  1. It is simpler to first split the string at the @ character, then clean up the first part and put the two parts back together.

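    function clean_name(address) {
      // Split into the part before the @ and the domain after it.
      let [localpart, domain] = address.split('@');
      if (domain.match(/^(gmail|googlemail)\.com$/i)) {
        // Gmail only: drop every dot, and drop "+" with everything after it.
        localpart = localpart.replace(/\+.*|\./g, '');
      }
      return localpart + '@' + domain;
    }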
      
    
    const teststr = `[email protected]
    [email protected]
    [email protected]
    [email protected]`;
    
    const tests = teststr.split("\n");
    
    const results = tests.map(clean_name);
    console.log(results);
  2. I turned the solution into a function, parseEmails(emails):

    const teststr = `[email protected]
    [email protected]
    [email protected]
    [email protected]`;
    
    const validGmailTest = /^[a-z0-9]+(?!.*(?:\+{2,}|-{2,}|\.{2,}))(?:[.+-]{0,1}[a-z0-9])*@gmail\.com$/gmi;
    const parseGmailTest = /\.+(?=.*@(gmail|googlemail)\.com)|\+.*(?=@(gmail|googlemail)\.com)|(?<=@g)oogle(?=mail\.com)/gi;
    
    function parseEmails ( emails )
    {
      if ( ! Array.isArray(emails) ) return parseEmails(String(emails).split("\n"));
      
      const copy = [ ...emails ];
      
      for ( const email in emails )
      {
        // Parse the email if it's invalid gmail.
        if ( emails[email].match(parseGmailTest) )
        {
          const [ user, domain ] = copy[email].split('@');
          copy[email] =
          (
            (
              user
                .replace(/\+.*/, '') // drop "+" and everything after it (step 2 of the question)
                .replaceAll('.', '')
              //.replaceAll('-', '')
            )
            + '@' +
            ( domain.replace('googlem', 'gm') )
          );
        }
        
        // Delete if is still invalid gmail.
        // Can replace with another test.
        if ( ! copy[email].match(validGmailTest) )
        {
          delete copy[email];
        }
      }
      
      return copy;
    }
    
    ///// TESTS \\
    
    console.log('teststr =');
    console.log(teststr);
    
    console.log('parseEmails(teststr) =');
    console.log(parseEmails(teststr));

    Note that I’m using your regular expression. There’s a much better one for testing valid Gmail addresses, which I’m also using as an optional final validator:

    /^[a-z0-9]+(?!.*(?:\+{2,}|-{2,}|\.{2,}))(?:[.+-]{0,1}[a-z0-9])*@(gmail|googlemail)\.com$/gmi
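
To connect the answers back to the use case in the question (rejecting a signup whose normalized address already exists), here is a minimal sketch. Everything named here is hypothetical: normalizeEmail follows the split-at-@ idea from the first answer, SignupRegistry is an in-memory Map standing in for the separate database field, and the addresses are made up. Lowercasing the Gmail local part is an added assumption that Gmail treats addresses case-insensitively.

```javascript
// Hypothetical normalizer in the style of the first answer.
function normalizeEmail(address) {
  const at = address.lastIndexOf("@");
  if (at === -1) return address; // nothing to split on, leave as-is
  let local = address.slice(0, at);
  let domain = address.slice(at + 1).toLowerCase(); // domains are case-insensitive
  if (domain === "gmail.com" || domain === "googlemail.com") {
    // Drop "+" with everything after it, and every dot.
    // Assumption: Gmail ignores case in the local part, so lowercase it too.
    local = local.replace(/\+.*|\./g, "").toLowerCase();
    domain = "gmail.com";
  }
  return local + "@" + domain;
}

// Hypothetical stand-in for the "normalized email" column in the database.
class SignupRegistry {
  constructor() {
    this.byNormalized = new Map(); // normalized address -> address as entered
  }

  register(email) {
    const key = normalizeEmail(email);
    if (this.byNormalized.has(key)) {
      return { ok: false, conflictsWith: this.byNormalized.get(key) };
    }
    this.byNormalized.set(key, email);
    return { ok: true, normalized: key };
  }
}

// Made-up addresses for illustration.
const registry = new SignupRegistry();
console.log(registry.register("john.doe@gmail.com"));
console.log(registry.register("JohnDoe+promo@googlemail.com"));
console.log(registry.register("john.doe@example.com"));
```

In a real application the lookup would presumably be a uniqueness check on the normalized field rather than a Map, while the address the user typed is still kept for sending mail.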
    