
I built a regex to capture a value from JSON: the pattern locates the key and fetches its value. But along with the expected values, it also matches empty strings.

Regex:
(?<=((?i)(finInstKey)":)["]?)(.*?)(?=["|,|}])|(?<="((?i)finInstKey","value":)["]?)(.*?)(?=["|,|}])

input:

  1. {"finInstKey":500},{"name":"finInstKey","value":12345678900987654321}
  2. {finInstKey":"500"},{"name":"finInstKey","value":"12345678900987654321"}

For these inputs, input 2 also produces empty strings along with the expected values.

actual output:

500
12345678900987654321

500

12345678900987654321

expected output:

500
12345678900987654321
500
12345678900987654321

For now I filter these out manually in the Java code, but it would be nice if the regex did not capture the empty strings.
What changes should I make to the regex to get the expected output?

Mainly, I want to use this with replaceAll to replace the matched values with the masked value "****".

My piece of code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTester {
    // %s is replaced with the field name (twice) via String.format
    private static final String regex = "(?<=((?i)(%s)\":)[\"]?)(.*?)(?=[\"|,|}])|(?<=\"((?i)%s\",\"value\":)[\"]?)(.*?)(?=[\"|,|}])";

    public static void main(String[] args) {
        String field = "finInstKey";
        String input = "{\"finInstKey\":500},{\"name\":\"finInstKey\",\"value\":12345678900987654321}{finInstKey\":\"500\"},{\"name\":\"finInstKey\",\"value\":\"12345678900987654321\"}";
        try {
            Pattern pattern = Pattern.compile(String.format(regex, field, field));
            Matcher matcher = pattern.matcher(input);
//            System.out.println(matcher.replaceAll("****"));
            while (matcher.find()) {
                System.out.println(matcher.group());
            }
        } catch (Exception e) {
            System.err.println(e);
        }

    }

}

4 Answers


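  1. In the second input the values are wrapped in quotes (and the finInstKey key is missing its opening quote). Because the quote before the value is optional inside the lookbehind, the regex can also match at the position just before that quote, where the lazy (.*?) stops immediately and yields an empty string. The pattern below makes the quotes around the key optional and consumes the quote around the value, so it matches both inputs and extracts the value from a capture group.

    Use it like this: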

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class Main {
    
        public static void main(String[] args) {
            String field = "finInstKey";
            String regex = "\"?" + field + "\"?(\\s*:\\s*\"?([^\",}]*)\"?|\",\"value\"\\s*:\\s*\"?([^\",}]*)\"?)";
    
            String input = "{\"finInstKey\":500},{\"name\":\"finInstKey\",\"value\":12345678900987654321}{finInstKey:\"500\"},{\"name\":\"finInstKey\",\"value\":\"12345678900987654321\"}";
    
            Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
            Matcher matcher = pattern.matcher(input);
    
            while (matcher.find()) {
                if (matcher.group(2) != null) {
                    System.out.println(matcher.group(2));
                } else {
                    System.out.println(matcher.group(3));
                }
            }
        }
    }
    

    here is the code
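
    To see where the empty strings in the question come from, the sketch below runs the question's own pattern against the start of the second input and records the start offset of every match. It is only an illustration: the class name EmptyMatchDemo and the helper findAll are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EmptyMatchDemo {
    // The pattern from the question, with finInstKey substituted for %s.
    static final String QUESTION_REGEX =
        "(?<=((?i)(finInstKey)\":)[\"]?)(.*?)(?=[\"|,|}])"
        + "|(?<=\"((?i)finInstKey\",\"value\":)[\"]?)(.*?)(?=[\"|,|}])";

    // Returns each match as "start:text" so that empty matches stay visible.
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.start() + ":" + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // First object of input 2 from the question.
        String input = "{finInstKey\":\"500\"}";
        // Offset 13 is the position just before the opening quote of "500".
        System.out.println(findAll(QUESTION_REGEX, input)); // prints [13:, 14:500]
    }
}
```

    The empty match sits at offset 13, directly before the quote that opens the value; the real value follows at offset 14.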

  2. It would probably be easier to parse JSON with a JSON parsing library instead of regex.
    Try the fromJson method from https://github.com/google/gson

    If you insist on using regex, maybe look into the + quantifier, which means "match one or more" (unlike *, it cannot match an empty string). Regex gets pretty difficult to read when it is as complicated as yours.
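
    As a rough sketch of the + idea: the variant below keeps a lookbehind like the one in the question but replaces the lazy (.*?) plus lookahead with [^",}]+, which needs at least one character and cannot start on a quote. The class and method names are hypothetical, and the pattern assumes compact JSON and a field name without regex metacharacters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PlusQuantifierDemo {
    // Collects every value that follows the key, in either JSON shape.
    // Assumption: field contains no regex metacharacters.
    static List<String> values(String input, String field) {
        String regex = "(?<=(?:" + field + "\":|\"" + field + "\",\"value\":)\"?)[^\",}]+";
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String input1 = "{\"finInstKey\":500},{\"name\":\"finInstKey\",\"value\":12345678900987654321}";
        String input2 = "{finInstKey\":\"500\"},{\"name\":\"finInstKey\",\"value\":\"12345678900987654321\"}";
        System.out.println(values(input1, "finInstKey")); // [500, 12345678900987654321]
        System.out.println(values(input2, "finInstKey")); // [500, 12345678900987654321]
    }
}
```

    Because the match still covers only the value, the same pattern can be passed to replaceAll("****") directly.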

  3. You can use the following pattern. The values are in capture groups 2 and 3.
    It’s not easy to determine where the value ends, since a text value may contain any of the possible delimiters.
    Make sure your data conforms; this pattern assumes each value is just a series of digits.

    (?si)("finInstKey")\s*:\s*"?(.+?)\b.+?"name"\s*:\s*\1\s*,\s*"value"\s*:\s*"?(.+?)\b
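
    To illustrate the delimiter problem, here is a small sketch with a deliberately simple, hypothetical pattern (not the one above) that ends the value at the first quote, comma or closing brace. A numeric value comes through intact, while a text value containing a comma is cut short.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DelimiterDemo {
    // Hypothetical pattern: the value ends at the first quote, comma or brace.
    private static final Pattern VALUE =
        Pattern.compile("\"finInstKey\"\\s*:\\s*\"?([^\",}]*)");

    // Returns the captured value, or null when the key is not found.
    static String firstValue(String json) {
        Matcher m = VALUE.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstValue("{\"finInstKey\":500}"));       // 500
        System.out.println(firstValue("{\"finInstKey\":\"AB,12\"}")); // AB (cut at the comma)
    }
}
```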
    

    That said, I recommend just using a JSON parsing library; Gson by Google works well.

    Your JSON strings are really sequences of objects, so just place each within square brackets to make it a valid array.

    [
      {
        "finInstKey": 500
      },
      {
        "name": "finInstKey",
        "value": 12345678900987654321
      }
    ]
    

    Note that your second example has a missing quotation mark for the finInstKey key.

    [
      {
        "finInstKey": "500"
      },
      {
        "name": "finInstKey",
        "value": "12345678900987654321"
      }
    ]
    

    With Gson you can utilize the JsonParser class to parse the values.

    String stringA = "[\n" +
        "  {\n" +
        "    \"finInstKey\": 500\n" +
        "  },\n" +
        "  {\n" +
        "    \"name\": \"finInstKey\",\n" +
        "    \"value\": 12345678900987654321\n" +
        "  }\n" +
        "]";
    String stringB = "[\n" +
        "  {\n" +
        "    \"finInstKey\": \"500\"\n" +
        "  },\n" +
        "  {\n" +
        "    \"name\": \"finInstKey\",\n" +
        "    \"value\": \"12345678900987654321\"\n" +
        "  }\n" +
        "]";
    
    JsonArray arrayA = JsonParser.parseString(stringA).getAsJsonArray();
    JsonObject objectA1 = arrayA.get(0).getAsJsonObject();
    JsonElement elementA1 = objectA1.get("finInstKey");
    int finInstKeyA = elementA1.getAsInt();
    JsonObject objectA2 = arrayA.get(1).getAsJsonObject();
    JsonElement elementA2 = objectA2.get("value");
    BigInteger valueA = elementA2.getAsBigInteger();
    System.out.println("finInstKeyA = " + finInstKeyA);
    System.out.println("valueA = " + valueA);
    
    JsonArray arrayB = JsonParser.parseString(stringB).getAsJsonArray();
    JsonObject objectB1 = arrayB.get(0).getAsJsonObject();
    JsonElement elementB1 = objectB1.get("finInstKey");
    String finInstKeyB = elementB1.getAsString();
    JsonObject objectB2 = arrayB.get(1).getAsJsonObject();
    JsonElement elementB2 = objectB2.get("value");
    String valueB = elementB2.getAsString();
    System.out.println("finInstKeyB = " + finInstKeyB);
    System.out.println("valueB = " + valueB);
    

    Output

    finInstKeyA = 500
    valueA = 12345678900987654321
    finInstKeyB = 500
    valueB = 12345678900987654321
    
  4. I think your regexp is not correct.

    public static List<String> getData(String str, String field) {
        String regex = "(?:\"?" + field + "\"?:\"?(\\d+)\"?)|(?:\"name\":\""
                + field + "\",\"value\":\"?(\\d+)\"?)";
        Matcher matcher = Pattern.compile(regex).matcher(str);
        List<String> data = new ArrayList<>();
    
        while (matcher.find()) {
            data.add(Optional.ofNullable(matcher.group(1))
                             .orElseGet(() -> matcher.group(2)));
        }
    
        return data;
    }
    

    Output:

    500
    12345678900987654321
    500
    12345678900987654321
    

    P.S. I think parsing JSON with a regexp is a strategically bad idea. I recommend using a JSON parser (Jackson, Gson, …).
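
    Since the question's end goal is masking, here is a minimal sketch of one way to do that with replaceAll, assuming compact JSON (no whitespace) and digit-only values as in the examples. It captures everything up to the value in group 1 and writes it back with $1, so only the digits are replaced. The class MaskDemo and the method mask are hypothetical names, not taken from any answer.

```java
import java.util.regex.Pattern;

final class MaskDemo {
    // Replaces the digits after the key (in either JSON shape) with ****.
    static String mask(String input, String field) {
        String key = Pattern.quote(field); // treat the field name literally
        // Group 1 is the prefix before the digits: key, colon and optional quote.
        String regex = "(\"?" + key + "\"?:\"?|\"name\":\"" + key + "\",\"value\":\"?)\\d+";
        return Pattern.compile(regex).matcher(input).replaceAll("$1****");
    }

    public static void main(String[] args) {
        String input1 = "{\"finInstKey\":500},{\"name\":\"finInstKey\",\"value\":12345678900987654321}";
        String input2 = "{finInstKey\":\"500\"},{\"name\":\"finInstKey\",\"value\":\"12345678900987654321\"}";
        System.out.println(mask(input1, "finInstKey"));
        // {"finInstKey":****},{"name":"finInstKey","value":****}
        System.out.println(mask(input2, "finInstKey"));
        // {finInstKey":"****"},{"name":"finInstKey","value":"****"}
    }
}
```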
