
I have record names that contain several types of SKU mixed in, and a SKU may contain symbols, digits, etc.

Examples:

Name of product 67304-4200-52-21
67304-4200-52 Name of product
67304-4200 Name of product
38927/6437 Name of product
BKK1MBM06-02 Name of product
BKK1MBM06 Name of product

I need to preg_match (PHP) only the SKU part, with any symbols in any combination.

So I wrote this pattern:

/\d+\/\d+|\d+-?\d+-?\d+-?\d+|\bbkk.*\b/i

It works, but not with the [BKK*] SKUs.

Is there a way to combine all these types of SKU in one pattern?

2 Answers


  1. Use

    \d+(?:\d+(?:-?\d+){3}|\/\d+)|\b[bB][kK][kK][A-Za-z0-9-]*
    

    See regex proof.

    REGEX101 EXPLANATION

    1st Alternative \d+(?:\d+(?:-?\d+){3}|\/\d+)
    \d matches a digit (equivalent to [0-9])
    + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
    Non-capturing group (?:\d+(?:-?\d+){3}|\/\d+)
    1st Alternative \d+(?:-?\d+){3}
    \d matches a digit (equivalent to [0-9])
    + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
    Non-capturing group (?:-?\d+){3}
    {3} matches the previous token exactly 3 times
    - matches the character - with index 45₁₀ (2D₁₆ or 55₈) literally (case sensitive)
    ? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
    \d matches a digit (equivalent to [0-9])
    + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
    2nd Alternative \/\d+
    \/ matches the character / with index 47₁₀ (2F₁₆ or 57₈) literally (case sensitive)
    \d matches a digit (equivalent to [0-9])
    + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
    2nd Alternative \b[bB][kK][kK][A-Za-z0-9-]*
    \b assert position at a word boundary: (^\w|\w$|\W\w|\w\W)
    Match a single character present in the list below [bB]
    bB matches a single character in the list bB (case sensitive)
    Match a single character present in the list below [kK]
    kK matches a single character in the list kK (case sensitive)
    Match a single character present in the list below [kK]
    kK matches a single character in the list kK (case sensitive)
    Match a single character present in the list below [A-Za-z0-9-]
    * matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
    a-z matches a single character in the range between a (index 97) and z (index 122) (case sensitive)
    0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
    - matches the character - with index 45₁₀ (2D₁₆ or 55₈) literally (case sensitive)
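
    As a rough sketch of how this pattern could be used from PHP, assuming the backslashes are written out (`\d`, `\b`) and `~` is chosen as the delimiter so that `/` needs no escaping. The helper name `extractSku` and the sample list are illustrative only and are not part of the answer.

    ```php
    <?php
    // Hypothetical helper: returns the first SKU-like token in $name, or null if none is found.
    function extractSku(string $name): ?string
    {
        // The answer's pattern with backslashes written out; "~" as delimiter keeps "/" literal.
        $pattern = '~\d+(?:\d+(?:-?\d+){3}|/\d+)|\b[bB][kK][kK][A-Za-z0-9-]*~';

        // preg_match returns 1 on a match, 0 on no match, false on error.
        return preg_match($pattern, $name, $m) === 1 ? $m[0] : null;
    }

    // Sample names taken from the question.
    $names = [
        'Name of product 67304-4200-52-21',
        '67304-4200-52 Name of product',
        '67304-4200 Name of product',
        '38927/6437 Name of product',
        'BKK1MBM06-02 Name of product',
        'BKK1MBM06 Name of product',
    ];

    foreach ($names as $name) {
        echo $name, ' => ', var_export(extractSku($name), true), PHP_EOL;
    }
    ```

    Because the hyphens inside `(?:-?\d+){3}` are optional, the shorter numeric SKUs are matched by splitting digit runs across the three repetitions, so the digits-only alternative needs at least five digits in total.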
    
  2. The pattern \d+-?\d+-?\d+-?\d+ means that there should be at least 4 digits, as all the hyphens are optional, but in the example data the numeric part has at least a single hyphen and consists of 2, 3 or 4 parts.

    You could repeat the part with the digits and hyphen 1 or more times, and instead of using .*\b use \S*\b to match optional non-whitespace chars that will backtrack until the last word boundary.
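
    To illustrate the difference between the two endings, here is a small sketch using one of the sample names from the question. The variable names are illustrative.

    ```php
    <?php
    $name = 'BKK1MBM06-02 Name of product';

    // Greedy ".*" runs to the end of the line, then settles on the last word boundary.
    preg_match('~\bbkk.*\b~i', $name, $greedy);

    // "\S*" stops at the first whitespace, so only the SKU token is matched.
    preg_match('~\bbkk\S*\b~i', $name, $token);

    echo $greedy[0], PHP_EOL; // BKK1MBM06-02 Name of product
    echo $token[0], PHP_EOL;  // BKK1MBM06-02
    ```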

    Note that if you use a delimiter other than / in PHP, you don't have to escape /.

    Using a case insensitive match:

    \b(?:\d+(?:-\d+)+|bkk\S*|\d+\/\d+)\b
    

    Explanation

    • \b A word boundary to prevent a partial word match
    • (?: Non capture group for the alternatives
      • \d+(?:-\d+)+ Match 1+ digits and repeat 1 or more times matching - and again 1+ digits (or use {1,3} instead of +)
      • | Or
      • bkk\S* Match bkk and optional non-whitespace characters
      • | Or
      • \d+\/\d+ Match 1+ digits / and 1+ digits
    • ) Close the non capture group
    • \b A word boundary

    See a regex101 demo.
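
    A minimal PHP sketch that runs this pattern over the six sample names from the question. It assumes `~` as the delimiter (so `/` is left unescaped) and the `i` flag for the case insensitive match. The helper name `findSku` is hypothetical.

    ```php
    <?php
    // Hypothetical helper: returns the SKU part of a record name, or null if none is found.
    function findSku(string $name): ?string
    {
        // "~" is the delimiter, so "/" is literal; "i" makes "bkk" case insensitive.
        $pattern = '~\b(?:\d+(?:-\d+)+|bkk\S*|\d+/\d+)\b~i';

        return preg_match($pattern, $name, $m) === 1 ? $m[0] : null;
    }

    // Sample names taken from the question.
    $samples = [
        'Name of product 67304-4200-52-21',
        '67304-4200-52 Name of product',
        '67304-4200 Name of product',
        '38927/6437 Name of product',
        'BKK1MBM06-02 Name of product',
        'BKK1MBM06 Name of product',
    ];

    foreach ($samples as $sample) {
        echo $sample, ' => ', var_export(findSku($sample), true), PHP_EOL;
    }
    ```

    Replacing `+` with `{1,3}` in `(?:-\d+)+`, as suggested above, would cap the numeric SKU at four hyphen-separated parts.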
