I have highly unstructured dynamic data where I have a field that is a multidimensional string array. The exact size of the outer or inner arrays is not known beforehand and can vary. I would like to find all documents where any/all of the nested array elements evaluate on a condition I have. So for the example document below I only want to match it if ANY of the arrays has elements that are ALL have a string length greater than 3 – which would be true in this case.
{
"_id": "00000000-0000-0007-0000-000000000001",
"stringArrays": [
[
"Hello",
"World",
"!"
],
[
"Fred",
"Todd"
]
]
}
The first array contains an element that is less than 3 the "!", but the second array with names matches both values of string length > 3. So overall the condition is met.
So far I have tried to find info about how to do this but all answers were either using fixed array indices which I can’t or hard regex conditions which works better but will not work in all my cases. Sometimes I need to also query element.Length $IN [1,3,5,..] so exact string length values and not just $LT or $EQ or $GT.
My only lead is maybe using https://www.mongodb.com/docs/manual/reference/operator/aggregation/strLenCP/#example somehow to project additional string length data to query later, but I would not know how to create a nested stringArraysLengths
that maybe have the same structure but instead of values it has the string lengths I can query instead.
Maybe there are better ways to do this. I am relatively new to Mongodb and would appreciate some more experience users giving me a few hints to find the right way to do this. Thank you!
2
Answers
Here’s one way you could do it with nested
"$reduce"
operations.Try it on mongoplayground.net.
This would be my approach:
Mongo Playground
Or use
$filter
to remove all short strings and check if result is not empty.