
I have a bunch of values representing weight… Some of them are ambiguous, like combining weight and external dimensions of the package, and some don’t contain the measurement unit. What I want is to take those ambiguous values out.

So I came up with this regex pattern ^[\d.kg ]+$, which mostly does that but gets some cases wrong. In the list here https://regex101.com/r/3SwCuf/1 it also matches plain numbers (lines 3 and 9).

Basically, what I want it to match is a decimal number, optionally followed by a whitespace character, and then the measuring unit (either kg or g). Nothing more, nothing less.

So in the list below I’m marking what it should match and what it does now:

153g
124g
92 // Matches but I want it out
11.4 kg
0.2 kg
26.0 kg
26.0 kg
27.6 kg
8.1 // Matches but I want it out
8.1 kg
212g
159 g
65.5g
194 g
1.6 kg
19.37 kg
19.4 kg
120 g
120 g
0.025kg 43 x 43 x 14 mm // It doesn't match, that's good
86 x 86 x 35 mm // It doesn't match, that's good
32.5 x 47 x 62 mm // It doesn't match, that's good

2 Answers


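  1. A character class [] matches any single character listed inside it, so [\d.kg ]+ also accepts strings made only of digits. If you need a unit at the end, move the unit out of the class and to the end of the pattern.

    ^\d+(\.\d+)?\s?(g|kg)$
    
        \d+       The integer part
        (\.\d+)?  The fraction part
        \s?       One optional white-space
        (g|kg)    The measuring unit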
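
As a rough illustration of this pattern in use, here is a minimal Python sketch that filters a few of the sample values from the question. The names `WEIGHT_RE`, `values` and `unambiguous` are made up for this example, and Python is only an assumption, since the question does not say which language or regex engine is used.

```python
import re

# Pattern from the answer above: integer part, optional fraction,
# one optional whitespace character, then the unit (g or kg).
WEIGHT_RE = re.compile(r"^\d+(\.\d+)?\s?(g|kg)$")

# A few of the sample values from the question.
values = [
    "153g",
    "92",
    "11.4 kg",
    "8.1",
    "159 g",
    "0.025kg 43 x 43 x 14 mm",
    "86 x 86 x 35 mm",
]

# Keep only the values that are a number followed by a unit.
unambiguous = [v for v in values if WEIGHT_RE.match(v)]
print(unambiguous)  # ['153g', '11.4 kg', '159 g']
```

The plain numbers ("92", "8.1") are rejected because the unit group is no longer optional, and the values that include package dimensions are rejected because of the `$` anchor.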
    
  2. You are using ^[\d.kg ]+$ where [\d.kg ]+ is a character class that is repeated 1 or more times, so any mix of digits, dots, k, g and spaces matches, including digits alone.

    You might use:

    ^\d+(?:\.\d+)?\h*k?g$
    

    The pattern matches:

    • ^ Start of string
    • \d+ Match 1+ digits
    • (?:\.\d+)? An optional part matching a dot and 1+ digits
    • \h* Match optional horizontal whitespace chars
    • k?g Match an optional k and then g
    • $ End of string
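
One caveat if you try this outside a PCRE-style engine: Python's built-in `re` module does not support `\h`. The sketch below is an adaptation under that assumption, substituting `[ \t]*` (spaces and tabs only) as an approximation of horizontal whitespace, and using `fullmatch` in place of the `^` and `$` anchors. The helper name `is_plain_weight` is hypothetical.

```python
import re

# Adaptation of the pattern above for Python's re module.
# Assumption: [ \t]* stands in for \h*, which re does not support.
# fullmatch() requires the whole string to match, so ^ and $ are omitted.
WEIGHT_RE = re.compile(r"\d+(?:\.\d+)?[ \t]*k?g")


def is_plain_weight(value: str) -> bool:
    """Return True if value is a number followed only by g or kg."""
    return WEIGHT_RE.fullmatch(value) is not None


for sample in ["65.5g", "19.37 kg", "120  g", "8.1", "32.5 x 47 x 62 mm"]:
    print(repr(sample), is_plain_weight(sample))
```

Unlike the `\s?` variant, this version accepts any number of spaces or tabs between the number and the unit, but not a line break.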

