
I’m trying to find all unique strings of some subdocument in MongoDB, as long as at least one of them contains a certain substring.
These strings are localized though.
I want to have the results for all possible languages.

So, my document would look something like this:

    {
      "myLocalizedField": {
        "en": ...,
        "de": ...,
        "cz": ...
      }
    }

I do not know which languages are being used in the DB and not all documents use the same languages.

This is what my current pipeline looks like:

    [
      { $match: { "I DON'T KNOW": { $regex: "my substring" } } },
      { $project: { myLocalizedField: 1 } },
      { $unwind: { path: "$myLocalizedField" } },
      { $group: { _id: { key: "I DON'T KNOW" } } }
    ]

I am unsure what to put in the $match and in the $group stages…



  1. $objectToArray to the rescue:

        [
          { $project: { localized: { $objectToArray: "$myLocalizedField" } } },
          { $unwind: { path: "$localized" } },
          { $match: { "localized.v": { $regex: "my substring" } } },
          { $group: { _id: { key: "$localized.k" } } }
        ]
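To see what these stages do without a running database, here is a minimal plain-JavaScript sketch. The sample documents, the helper name `uniqueMatches`, and the choice to group on both key and value are assumptions for illustration, and `String.prototype.includes` does a literal substring test, whereas `$regex` interprets its argument as a pattern.

```javascript
// Plain-JavaScript stand-in for the pipeline stages; no MongoDB needed.
// The sample documents below are made up for illustration.
const docs = [
  { myLocalizedField: { en: "Hello world", de: "Hallo Welt" } },
  { myLocalizedField: { en: "Hello world", cz: "Ahoj světe" } },
];

// Mimics $objectToArray: { en: "x" } becomes [{ k: "en", v: "x" }].
function objectToArray(obj) {
  return Object.entries(obj).map(([k, v]) => ({ k, v }));
}

// Hypothetical helper: $project + $unwind + $match + $group in one pass.
function uniqueMatches(documents, substring) {
  const seen = new Map();
  for (const doc of documents) {
    for (const { k, v } of objectToArray(doc.myLocalizedField)) {
      // Literal substring test; $regex would treat the argument as a pattern.
      if (typeof v === "string" && v.includes(substring)) {
        // Assumption: group on language key and string together.
        seen.set(JSON.stringify([k, v]), { key: k, value: v });
      }
    }
  }
  return [...seen.values()];
}

const result = uniqueMatches(docs, "llo");
console.log(result);
```

Because the language keys are read from each document at run time, nothing here needs to know in advance which languages exist.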
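  2. You can use the $objectToArray operator, which turns the first form below into the second:

    {"en":"foo", "de":"bar", "cz":"buz"}
    [{"k":"en", "v":"foo"}, {"k":"de", "v":"bar"}, {"k":"cz", "v":"buz"}]

    The first $match keeps documents in which any string matches; the second, after $unwind, keeps only the matching strings:

    [
      { "$addFields": { "tmp": { "$objectToArray": "$myLocalizedField" } } },
      { $match: { "tmp.v": { $regex: "llo" } } },
      { $unwind: { path: "$tmp" } },
      { $match: { "tmp.v": { $regex: "llo" } } }
    ]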


    Note: This solution doesn’t scale, because $objectToArray reshapes every document before the $match, so the search cannot use an index.

    I recommend storing your documents this way:

      "myLocalizedField": [
        {"k": "en", "v": "Hello world"},
        {"k": "de", "v": "Hallo Welt"},
        {"k": "cz", "v": "Ahoj světe"},
        {"k": "es", "v": "Hola Mundo"}
      ]

    This way you can match on "myLocalizedField.v" directly, without reshaping each document first, and it is easier to apply filters.
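As a rough sketch of how a query against the array-shaped schema might look: the pipeline below matches on the nested field directly, then unwinds and groups. The stage order, the `value` group key, and the collection name in the usage comment are assumptions for illustration, not something the answer specifies.

```javascript
// Sketch of a pipeline for the array-shaped schema. The stages after the
// first $match, and the "value" group key, are assumptions for illustration.
const substring = "llo";

const pipeline = [
  // Keep documents in which at least one translation contains the substring.
  { $match: { "myLocalizedField.v": { $regex: substring } } },
  // One output document per { k, v } entry.
  { $unwind: { path: "$myLocalizedField" } },
  // Keep only the entries that contain the substring themselves.
  { $match: { "myLocalizedField.v": { $regex: substring } } },
  // Collapse duplicates per language/string pair.
  { $group: { _id: { key: "$myLocalizedField.k", value: "$myLocalizedField.v" } } },
];

// Usage in mongosh, with a hypothetical collection name:
// db.myCollection.aggregate(pipeline)
```

No $objectToArray stage is needed, since the field is already stored as an array of k/v pairs.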
