skip to Main Content

I have one collection that the structure of the record look like this,

{
  _id: ObjectId('66a544dc3a8dff9aaaf65343'),
  user_type: "USER",
  status: CREATE
}

The explanation of each field:

  • _id : ID generate by MongoDB
  • user_type : the record’s user type . Can only be USER, ADMIN, DEV
  • status : the status. Can only be CREATE, ACTIVE, DELETE

I want to run the aggregate to get this output

{
  {
    _id: "ALL",
    stat: {
      "CREATE" : 2,
      "ACTIVE" : 5,
      "DELETE" : 4
    }
    "total": 11
  {
    _id: "USER",
    stat: {
      "CREATE" : 1,
      "ACTIVE" : 2,
      "DELETE" : 0
    },
    "total": 3
  },
  {
    _id: "ADMIN",
    stat: {
      "CREATE" : 0,
      "ACTIVE" : 3,
      "DELETE" : 0
    },
    "total": 3
  },
  {
    _id: "DEV",
    stat: {
      "CREATE" : 1,
      "ACTIVE" : 0,
      "DELETE" : 4
    },
    "total": 5
  }
}

I want to count each occurrence of status for all user_type, compute total number of all status, and create a TOTAL document to count the entire collection

I can easily generate all documents except _id: "ALL" document easily using $group with $project. The aggregate command look like this,

db.getCollection('accounts').aggregate(
  [
    {
      $group: {
        _id: '$user_type',
        CREATE: {
          $sum: {
            $cond: [
              {
                $eq: ['$$ROOT.status', 'CREATE']
              },
              1,
              0
            ]
          }
        },
        ACTIVE: {
          $sum: {
            $cond: [
              {
                $eq: ['$$ROOT.status', 'ACTIVE']
              },
              1,
              0
            ]
          }
        },
        DELETE: {
          $sum: {
            $cond: [
              {
                $eq: ['$$ROOT.status', 'DELETE']
              },
              1,
              0
            ]
          }
        },
      }
    },
    {
      $project: {
        stat: {
          CREATE: '$CREATE',
          ACTIVE: '$ACTIVE',
          DELETE: '$DELETE',
        },
        total: {
          $sum: [
            '$CREATE',
            '$ACTIVE',
            '$DELETE',
          ]
        }
      }
    }
  ],
); 

The problem is the second $group command with _id: null

The problem arise when I want to generate another document. I can put another $group command (using _id: null), but it will delete all others record (the only thing left will be the record which _id: null)

Is there a way to preserve the previous records in pipeline? or is there an alternative solution which is better that this? any comment is welcome.

Thank you very much!

2

Answers


  1. I can think of 2 ways to do it.

    1. use $facet to perform the grouping sum twice. This might involve more data wrangling at later stages, but this should be the most performant and the canonical way to do what you expect.
    db.accounts.aggregate([
      {
        "$facet": {
          "accounts": [
            {
              "$group": {
                "_id": "$user_type",
                CREATE: {
                  $sum: {
                    $cond: [
                      {
                        $eq: [
                          "$status",
                          "CREATE"
                        ]
                      },
                      1,
                      0
                    ]
                  }
                },
                ACTIVE: {
                  $sum: {
                    $cond: [
                      {
                        $eq: [
                          "$status",
                          "ACTIVE"
                        ]
                      },
                      1,
                      0
                    ]
                  }
                },
                DELETE: {
                  $sum: {
                    $cond: [
                      {
                        $eq: [
                          "$status",
                          "DELETE"
                        ]
                      },
                      1,
                      0
                    ]
                  }
                }
              }
            }
          ],
          "all": [
            {
              "$group": {
                "_id": "ALL",
                CREATE: {
                  $sum: {
                    $cond: [
                      {
                        $eq: [
                          "$status",
                          "CREATE"
                        ]
                      },
                      1,
                      0
                    ]
                  }
                },
                ACTIVE: {
                  $sum: {
                    $cond: [
                      {
                        $eq: [
                          "$status",
                          "ACTIVE"
                        ]
                      },
                      1,
                      0
                    ]
                  }
                },
                DELETE: {
                  $sum: {
                    $cond: [
                      {
                        $eq: [
                          "$status",
                          "DELETE"
                        ]
                      },
                      1,
                      0
                    ]
                  }
                }
              }
            }
          ]
        }
      },
      {
        "$project": {
          "docs": {
            "$concatArrays": [
              "$accounts",
              "$all"
            ]
          }
        }
      },
      {
        "$unwind": "$docs"
      },
      {
        "$replaceRoot": {
          "newRoot": "$docs"
        }
      },
      {
        $project: {
          stat: {
            CREATE: "$CREATE",
            ACTIVE: "$ACTIVE",
            DELETE: "$DELETE"
          },
          total: {
            $sum: [
              "$CREATE",
              "$ACTIVE",
              "$DELETE"
            ]
          }
        }
      }
    ])
    

    Mongo Playground


    1. use $$unionWith after your current $group stage to perform the grouping sum again.
    db.accounts.aggregate([
      {
        "$group": {
          "_id": "$user_type",
          CREATE: {
            $sum: {
              $cond: [
                {
                  $eq: [
                    "$status",
                    "CREATE"
                  ]
                },
                1,
                0
              ]
            }
          },
          ACTIVE: {
            $sum: {
              $cond: [
                {
                  $eq: [
                    "$status",
                    "ACTIVE"
                  ]
                },
                1,
                0
              ]
            }
          },
          DELETE: {
            $sum: {
              $cond: [
                {
                  $eq: [
                    "$status",
                    "DELETE"
                  ]
                },
                1,
                0
              ]
            }
          }
        }
      },
      {
        "$unionWith": {
          "coll": "accounts",
          "pipeline": [
            {
              "$group": {
                "_id": "ALL",
                CREATE: {
                  $sum: {
                    $cond: [
                      {
                        $eq: [
                          "$status",
                          "CREATE"
                        ]
                      },
                      1,
                      0
                    ]
                  }
                },
                ACTIVE: {
                  $sum: {
                    $cond: [
                      {
                        $eq: [
                          "$status",
                          "ACTIVE"
                        ]
                      },
                      1,
                      0
                    ]
                  }
                },
                DELETE: {
                  $sum: {
                    $cond: [
                      {
                        $eq: [
                          "$status",
                          "DELETE"
                        ]
                      },
                      1,
                      0
                    ]
                  }
                }
              }
            }
          ]
        }
      },
      {
        $project: {
          stat: {
            CREATE: "$CREATE",
            ACTIVE: "$ACTIVE",
            DELETE: "$DELETE"
          },
          total: {
            $sum: [
              "$CREATE",
              "$ACTIVE",
              "$DELETE"
            ]
          }
        }
      }
    ])
    

    Mongo Playground


    Strictly speaking, both of them are just workarounds and do not actually address the need to reuse previously computed results. Although not likely, they might incur performance issues since they are grouping twice. If the data does not change frequently or you have more tolerance for stale data, you may consider materializing the computed result and perform further grouping on it.

    Login or Signup to reply.
  2. Add the following pipelines to the aggregation.

      {
        $group: {
          _id: "ALL",
    
          CREATE: { $sum: "$stat.CREATE" },
          ACTIVE: { $sum: "$stat.ACTIVE" },
          DELETE: { $sum: "$stat.DELETE" },
          total: { $sum: "$total" },
          groups: { $push: "$$ROOT" },
        },
      },
      {
        $project: {
          stat: {
            CREATE: "$CREATE",
            ACTIVE: "$ACTIVE",
            DELETE: "$DELETE",
          },
          total: 1,
          groups: 1,
        },
      },
    

    In the groups field, you can find the document by $user_type.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search