I built a regex that identifies a key in JSON text and captures its value. But along with the expected matches, it also captures empty strings.
Regex:
(?<=((?i)(finInstKey)":)["]?)(.*?)(?=["|,|}])|(?<="((?i)finInstKey","value":)["]?)(.*?)(?=["|,|}])
input:
- {"finInstKey":500},{"name":"finInstKey","value":12345678900987654321}
- {finInstKey":"500"},{"name":"finInstKey","value":"12345678900987654321"}
For these inputs, input 2 also yields empty strings along with the expected values.
actual output:
500
12345678900987654321
(empty string)
500
(empty string)
12345678900987654321
expected output:
500
12345678900987654321
500
12345678900987654321
For now, I have handled it manually in the Java code, but it would be nice if the regex didn't capture the empty strings.
What changes should I make to the regex to get the expected output?
Mainly, I want to use this with replaceAll to replace the matched values with the mask "****".
My piece of code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTester {
    // %s is replaced with the field name; inner double quotes are escaped for the Java string literal.
    private static final String regex = "(?<=((?i)(%s)\":)[\"]?)(.*?)(?=[\"|,|}])|(?<=\"((?i)%s\",\"value\":)[\"]?)(.*?)(?=[\"|,|}])";

    public static void main(String[] args) {
        String field = "finInstKey";
        // Both sample inputs concatenated into one string.
        String input = "{\"finInstKey\":500},{\"name\":\"finInstKey\",\"value\":12345678900987654321}{finInstKey\":\"500\"},{\"name\":\"finInstKey\",\"value\":\"12345678900987654321\"}";
        try {
            Pattern pattern = Pattern.compile(String.format(regex, field, field));
            Matcher matcher = pattern.matcher(input);
            // System.out.println(matcher.replaceAll("****"));
            while (matcher.find()) {
                System.out.println(matcher.group());
            }
        } catch (Exception e) {
            System.err.println(e);
        }
    }
}
4 Answers
The finInstKey key is not enclosed in quotes, leading to empty matches. By changing the pattern to "finInstKey" you will allow it to match this input and correctly extract the value.
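To see exactly where the empty matches come from, it can help to print each match together with its start offset. The sketch below is only a diagnostic, not the code from this answer: the class name EmptyMatchDemo and the helper describeMatches are made up for illustration, and the regex is the one from the question with the key filled in. It uses a correctly quoted key on purpose, to check whether the empty match depends on the missing quote.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmptyMatchDemo {
    // The question's regex with the key filled in and the quotes escaped for Java.
    static final Pattern ORIGINAL = Pattern.compile(
        "(?<=((?i)(finInstKey)\":)[\"]?)(.*?)(?=[\"|,|}])"
        + "|(?<=\"((?i)finInstKey\",\"value\":)[\"]?)(.*?)(?=[\"|,|}])");

    // Hypothetical helper: returns one "start:text" entry per match,
    // so that zero-length matches stay visible in the result.
    static List<String> describeMatches(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = ORIGINAL.matcher(input);
        while (m.find()) {
            found.add(m.start() + ":" + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // A quoted value with a correctly quoted key.
        System.out.println(describeMatches("{\"finInstKey\":\"500\"}"));
        // An unquoted value for comparison.
        System.out.println(describeMatches("{\"finInstKey\":500}"));
    }
}
```

For the quoted value, the first entry is a zero-length match sitting right before the opening quote of the value: the optional ["]? in the lookbehind can match nothing, the lazy (.*?) can match nothing, and the lookahead accepts the quote. So the empty match appears for any quoted value, even when the key is quoted correctly. Replacing (.*?) with something that needs at least one non-delimiter character, for example [^",}]+, should avoid it.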
It'd probably be easier to use a JSON parsing library to parse JSON, instead of regex.
Try the .fromJson method from https://github.com/google/gson
If you insist on using regex, maybe look into the + symbol in regex; it means "match one or more". Regex is pretty difficult to read when it gets complicated like you have there.
You can use the following pattern. The capture groups are 2 and 3.
It’s not easy to determine the end of the value, considering a text value may contain any of the possible delimiters.
Make sure your data will conform; this assumes the value is just a series of digits.
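As a rough sketch of what relying on digits-only values could look like for the masking use case (this is not the pattern from this answer; the class DigitMasker and the method mask are hypothetical names): the key and any optional quote are captured as a prefix in group 1 and written back, and only the run of digits is replaced.

```java
import java.util.regex.Pattern;

public class DigitMasker {
    // Hypothetical helper: masks the value of `field`, assuming the value is digits only.
    // Group 1 captures either   key":   or   "key","value":   plus an optional opening quote.
    static String mask(String input, String field) {
        String key = Pattern.quote(field); // treat the field name as a literal
        Pattern p = Pattern.compile(
            "(?i)(\"?" + key + "\"\\s*:\\s*\"?"
            + "|\"" + key + "\"\\s*,\\s*\"value\"\\s*:\\s*\"?)\\d+");
        // \d+ needs at least one digit, so a zero-length value can never match.
        return p.matcher(input).replaceAll("$1****");
    }

    public static void main(String[] args) {
        System.out.println(mask(
            "{\"finInstKey\":500},{\"name\":\"finInstKey\",\"value\":12345678900987654321}",
            "finInstKey"));
        System.out.println(mask(
            "{finInstKey\":\"500\"},{\"name\":\"finInstKey\",\"value\":\"12345678900987654321\"}",
            "finInstKey"));
    }
}
```

Using a capture group for the prefix instead of a lookbehind is a design choice here: Java lookbehinds need a bounded length, so they cannot contain \s*, while a captured prefix can.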
Although, I recommend just using a JSON parsing module, Gson by Google works well.
Your JSON strings are actually arrays, so just place each within square brackets.
Note that your second example has a missing quotation mark for the finInstKey key.
With Gson you can utilize the JsonParser class to parse the values.
Output
I think your regexp is not correct.
Output:
P.S. I think parsing JSON with regexp is a strategically bad idea. I recommend using a JSON parser (Jackson, Gson, …)