
I need a SQL regexp pattern that satisfies the following criteria.

1. Accepts Chinese characters,
2. Accepts a-z, A-Z, 0-9, spaces
3. Rejects only special characters.

I’ve tried the following.

  Select regexp_like((val)::TEXT , ('^[ !-’~¡-ÿ]*$')::TEXT)


      Or

regexp_like((val)::TEXT, ('^[^[:ascii:]]+$')::TEXT);

The queries above also accept special characters, which they should not.

SELECT (('#$$')::TEXT ~ ('^[a-zA-Z0-9]*$'));

This query rejects special characters as required, but fails to accept Chinese characters.

2 Answers


  1. You can use the Unicode values of the Chinese characters:

    SELECT (('#$$')::TEXT ~ ('^[a-zA-Z0-9\x4e00-\x9fff\x3400-\x4dbf]*$'));
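The allow-list idea in this answer can be sanity-checked outside the database. The following is a minimal sketch in Python, not the PostgreSQL engine itself; the names `ALLOWED` and `is_clean` are hypothetical, and a space is added to the class because requirement 2 asks for spaces even though the query above omits it.

```python
import re

# Allow-list: ASCII letters, digits, space, CJK Unified Ideographs
# (U+4E00-U+9FFF) and CJK Extension A (U+3400-U+4DBF).
# The space in the class is an assumption taken from requirement 2.
ALLOWED = re.compile(r'^[a-zA-Z0-9 \u4e00-\u9fff\u3400-\u4dbf]*$')


def is_clean(value: str) -> bool:
    """Return True when every character of value is in the allow-list."""
    # fullmatch avoids the quirk where `$` also matches before a trailing newline.
    return ALLOWED.fullmatch(value) is not None


if __name__ == "__main__":
    for sample in ['#$$', 'abc 123', '中文', '中文 abc', 'a@b']:
        print(repr(sample), is_clean(sample))
```

Because the class lists only what is accepted, anything not listed (punctuation, symbols, accented Latin letters) is rejected without having to enumerate it.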
    
  2. According to Wikipedia, Chinese characters fall within the Unicode range U+4E00 through U+9FFF (Wikipedia – CJK Unified Ideographs).

    So, you can add a Unicode range to your character class, as follows.

    "1.Accepts chinese characters"

    [\u4e00-\u9fff]
    

    "2.Accepts – a-z,A-Z,0-9,spaces"

    The (?i) flag turns on case-insensitive mode.

    (?i)[a-z\d \u4e00-\u9fff]
    

    "3.Rejects only special characters."

    I imagine the values you provided are the characters you wish to reject.
    Within the provided range, ! through ’, you want to skip over the digit characters, 0 through 9, and the uppercase letters, A through Z.

    So, that will need to be changed to the following.

    [^!-/:-@\[-`~¡-ÿ]
    

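    You can then combine this character class with the previous one, using the character class intersection syntax, &&. Note that && is a Java-style regex feature; PostgreSQL's regex engine does not support it.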

    So, the complete pattern would be the following.

    (?i)^[a-z\d \u4e00-\u9fff&&[^!-/:-@\[-`~¡-ÿ]]*$
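Not every regex engine has the `&&` intersection operator. As a rough sketch of the same "allowed and not rejected" logic, the Python version below emulates the intersection with a negative lookahead in front of each character. The names `REJECT`, `ALLOW`, `PATTERN` and `accepts` are hypothetical, and the `re.ASCII` flag is an assumption added so that `\d` and case-insensitivity stay limited to ASCII.

```python
import re

# Characters the answer wants rejected: ASCII punctuation ranges and
# the Latin-1 range from U+00A1 to U+00FF.
REJECT = r'[!-/:-@\[-`~¡-ÿ]'
# Characters the answer wants accepted: letters, digits, space, and
# CJK Unified Ideographs (U+4E00-U+9FFF).
ALLOW = r'[a-z\d \u4e00-\u9fff]'

# Python's re has no `&&`, so "in ALLOW and not in REJECT" is expressed
# as a negative lookahead placed before each consumed character.
PATTERN = re.compile(rf'(?:(?!{REJECT}){ALLOW})*', re.IGNORECASE | re.ASCII)


def accepts(value: str) -> bool:
    """Return True when the whole string passes both character classes."""
    return PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for sample in ['Hello 世界 42', 'ABC', '#$$', 'café']:
        print(repr(sample), accepts(sample))
```

For these two particular classes the allow-list already excludes every rejected character, so the lookahead mainly documents intent; the allow-list on its own gives the same result.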
    