I have a collection whose documents have a field named "data" that can contain arbitrary fields. I need to get either all the field names that exist inside "data" across the whole collection, or the documents whose "data" field has a different set of fields.

For example, if I have:

[
    {
        _id: "45454",
        name: "fulano",
        city: "cali",
        data: {
            age: 12,
            lastName: "panguano",
            cars: 0
        }
    },
    {
        _id: "67899",
        name: "juanito",
        city: "cali",
        data: {
            age: 23,
            lastName: "merlano",
            cars: 2
        }
    },
    {
        _id: "67899",
        name: "olito",
        city: "nw",
        data: {
            lastName: "betito",
            cars: 2
        }
    },
    {
        _id: "11223",
        name: "cabrito",
        city: "trujillo",
        data: {
            age: 28,
            cars: 1,
            moto: 3
        }
    },
]

What I would like to get:

["age", "lastName", "cars", "moto"]

or, alternatively, the documents whose "data" fields differ, regardless of their values:

[
    {
        _id: "45454",
        name: "fulano",
        city: "cali",
        data: {
            age: 12,
            lastName: "panguano",
            cars: 0
        }
    },
    {
        _id: "67899",
        name: "olito",
        city: "nw",
        data: {
            lastName: "betito",
            cars: 2
        }
    },
    {
        _id: "11223",
        name: "cabrito",
        city: "trujillo",
        data: {
            age: 28,
            cars: 1,
            moto: 3
        }
    }
    
]

The collection has a very large number of documents, so fetching everything with findAll and then looping over the results with a for loop could be a problem in terms of resources.

2 Answers


  1. Here’s a way to do it in JavaScript once you have an array of all the documents in the collection:

    let arr = [
        {
            _id: "45454",
            name: "fulano",
            city: "cali",
            data: {
                age: 12,
                lastName: "panguano",
                cars: 0
            }
        },
        {
            _id: "67899",
            name: "juanito",
            city: "cali",
            data: {
                age: 23,
                lastName: "merlano",
                cars: 2
            }
        },
        {
            _id: "67899",
            name: "olito",
            city: "nw",
            data: {
                lastName: "betito",
                cars: 2
            }
        },
        {
            _id: "11223",
            name: "cabrito",
            city: "trujillo",
            data: {
                age: 28,
                cars: 1,
                moto: 3
            }
        },
    ]
    

    You can use the .map method to get an array of the data objects like so:

    arr = arr.map(obj => obj.data)
    

    After this, arr holds:

    [
        {
            "age": 12,
            "lastName": "panguano",
            "cars": 0
        },
        {
            "age": 23,
            "lastName": "merlano",
            "cars": 2
        },
        {
            "lastName": "betito",
            "cars": 2
        },
        {
            "age": 28,
            "cars": 1,
            "moto": 3
        }
    ]
    

    Then you can get an array of data object keys by looping through the array of data objects like so:

    let dataKeys = [];
    arr.forEach(obj => {
        dataKeys = [...dataKeys, ...Object.keys(obj)]
    })
    

    This leaves dataKeys holding an array of non-unique keys:

    dataKeys = [
        "age",
        "lastName",
        "cars",
        "age",
        "lastName",
        "cars",
        "lastName",
        "cars",
        "age",
        "cars",
        "moto"
    ]
    

    Then remove the duplicates, keeping only the first occurrence of each key, using the .filter and .findIndex methods:

    let uniqueKeys = dataKeys.filter((elem, index) => dataKeys.findIndex(obj => obj === elem) === index)
    

    And this will give you

    [
        "age",
        "lastName",
        "cars",
        "moto"
    ]
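
    The steps above can also be collapsed into a single pass with a Set, and the same idea extends to the second thing the question asks for (one document per distinct set of "data" keys). The following is only a sketch under the assumption that all documents are already in memory; the helper names collectDataKeys and firstDocPerKeySet are made up for illustration, and the sample documents are trimmed to the relevant fields.

    ```javascript
    // Sample documents from the question, trimmed to the fields that matter here.
    const docs = [
      { name: "fulano", data: { age: 12, lastName: "panguano", cars: 0 } },
      { name: "juanito", data: { age: 23, lastName: "merlano", cars: 2 } },
      { name: "olito", data: { lastName: "betito", cars: 2 } },
      { name: "cabrito", data: { age: 28, cars: 1, moto: 3 } },
    ];

    // Hypothetical helper: every distinct key found under `data`, in first-seen order.
    function collectDataKeys(documents) {
      const keys = new Set();
      for (const doc of documents) {
        // Documents without a `data` field contribute no keys.
        for (const key of Object.keys(doc.data ?? {})) keys.add(key);
      }
      return [...keys];
    }

    // Hypothetical helper: the first document seen for each distinct set of `data` keys.
    function firstDocPerKeySet(documents) {
      const seen = new Map();
      for (const doc of documents) {
        // Sort the keys so that key order does not affect the comparison.
        const signature = JSON.stringify(Object.keys(doc.data ?? {}).sort());
        if (!seen.has(signature)) seen.set(signature, doc);
      }
      return [...seen.values()];
    }

    console.log(collectDataKeys(docs)); // [ 'age', 'lastName', 'cars', 'moto' ]
    console.log(firstDocPerKeySet(docs).map(d => d.name)); // [ 'fulano', 'olito', 'cabrito' ]
    ```

    Using a Set avoids the repeated array copying and the quadratic .findIndex scan, but it still requires loading every document, which is exactly what the question is worried about.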
    
  2. Regardless of how you execute this (in memory or on the database), it is a very expensive query. That said, I agree that doing it in memory is the wrong approach.

    Here’s how to do it using the aggregation pipeline and some standard operators like $map and $objectToArray:

    db.collection.aggregate([
      {
        $project: {
          keys: {
            $map: {
              input: {
                "$objectToArray": "$data"
              },
              in: "$$this.k"
            }
          }
        }
      },
      {
        "$unwind": "$keys"
      },
      {
        $group: {
          _id: "$keys"
        }
      }
    ])
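
    The pipeline above returns one document per key (for example { _id: "age" }). As a sketch only, here are two variants written as plain JavaScript arrays that could be passed to aggregate: one that collects the keys into a single array, and one that keeps one representative document per distinct key set. The variable names are made up for illustration, the second variant assumes MongoDB 5.2 or later because it relies on $sortArray, and neither has been benchmarked on a large collection.

    ```javascript
    // Expression reused by both pipelines: the key names of the `data` sub-document.
    const dataKeys = {
      $map: { input: { $objectToArray: "$data" }, in: "$$this.k" },
    };

    // Variant 1: collapse all keys into one array instead of one document per key.
    // The order of elements produced by $addToSet is unspecified.
    const allKeysPipeline = [
      { $project: { keys: dataKeys } },
      { $unwind: "$keys" },
      { $group: { _id: null, keys: { $addToSet: "$keys" } } },
    ];

    // Variant 2: one representative document per distinct set of `data` keys.
    const distinctShapesPipeline = [
      // Skip documents where `data` is missing or is not an embedded document.
      { $match: { data: { $type: "object" } } },
      // Sort the key names so that field order does not split otherwise equal sets.
      { $addFields: { keySet: { $sortArray: { input: dataKeys, sortBy: 1 } } } },
      // Without a preceding $sort, which document $first picks per group is arbitrary.
      { $group: { _id: "$keySet", doc: { $first: "$$ROOT" } } },
      { $replaceRoot: { newRoot: "$doc" } },
      { $unset: "keySet" },
    ];

    // In mongosh (assumed usage): db.collection.aggregate(allKeysPipeline)
    ```

    Both variants still scan every document, so they remain expensive; the difference is that the work happens on the server and only the small result crosses the network.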
    
