skip to Main Content

I am facing a problem in MongoDB. Suppose, I have the following collection.

{ id: 1, issueDate: "07/05/2021", code: "31" },
{ id: 2, issueDate: "12/11/2020", code: "14" },
{ id: 3, issueDate: "02/11/2021", code: "98" },
{ id: 4, issueDate: "01/02/2021", code: "14" },
{ id: 5, issueDate: "06/23/2020", code: "14" },
{ id: 6, issueDate: "07/01/2020", code: "31" },
{ id: 7, issueDate: "07/05/2022", code: "14" },
{ id: 8, issueDate: "07/02/2022", code: "20" },
{ id: 9, issueDate: "07/02/2022", code: "14" }

The date field is in the format MM/DD/YYYY. My goal is to get the count of items with each season (spring (March-May), summer (June-August), autumn (September-November) and winter (December-February).

The result I’m expecting is:

  1. count of fields for each season:

    { "_id" : "Summer", "count" : 6 }
    { "_id" : "Winter", "count" : 3 }

  2. top 2 codes (first and second most recurring) per season:

    { "_id" : "Summer", "codes" : {14, 31} }
    { "_id" : "Winter", "codes" : {14, 98} }

How can this be done?

3

Answers


  1. I will give you clues,

    1. You need to use $group with _id as $month on issueDate, use accumulator $sum to get month wise count.
    2. You can divide month by 3, to get modulo, using $toInt, $divide, then put them into category using $cond.
    Login or Signup to reply.
  2. You should never store date/time values as string, store always proper Date objects.

    You can use $setWindowFields opedrator for that:

    db.collection.aggregate([
       // Convert string into Date
       { $set: { issueDate: { $dateFromString: { dateString: "$issueDate", format: "%m/%d/%Y" } } } },
       // Determine the season (0..3)
       {
          $set: {
             season: { $mod: [{ $toInt: { $divide: [{ $add: [{ $subtract: [{ $month: "$issueDate" }, 1] }, 1] }, 3] } }, 4] }
          }
       },
       // Count codes per season
       {
          $group: {
             _id: { season: "$season", code: "$code" },
             count: { $count: {} },
          }
       },
       // Rank occurrence of codes per season
       {
          $setWindowFields: {
             partitionBy: "$_id.season",
             sortBy: { count: -1 },
             output: {
                rank: { $denseRank: {} },
                count: { $sum: "$count" }
             }
          }
       },
       // Get only top 2 ranks
       { $match: { rank: { $lte: 2 } } },
       // Final grouping
       {
          $group: {
             _id: "$_id.season",
             count: { $first: "$count" },
             codes: { $push: "$_id.code" }
          }
       },
       // Some cosmetic for output
       {
          $set: {
             season: {
                $switch: {
                   branches: [
                      { case: { $eq: ["$_id", 0] }, then: 'Winter' },
                      { case: { $eq: ["$_id", 1] }, then: 'Spring' },
                      { case: { $eq: ["$_id", 2] }, then: 'Summer' },
                      { case: { $eq: ["$_id", 3] }, then: 'Autumn' },
                   ]
                }
             }
          }
       }
    ])
    

    Mongo Playground

    Login or Signup to reply.
  3. Another option:

    db.collection.aggregate([
    {
     $addFields: {
      "season": {
        $switch: {
          branches: [
            {
              case: {
                $in: [
                  {
                    $substr: [
                      "$issueDate",
                      0,
                      2
                    ]
                  },
                  [
                    "06",
                    "07",
                    "08"
                  ]
                ]
              },
              then: "Summer"
            },
            {
              case: {
                $in: [
                  {
                    $substr: [
                      "$issueDate",
                      0,
                      2
                    ]
                  },
                  [
                    "03",
                    "04",
                    "05"
                  ]
                ]
              },
              then: "Spring"
            },
            {
              case: {
                $in: [
                  {
                    $substr: [
                      "$issueDate",
                      0,
                      2
                    ]
                  },
                  [
                    "12",
                    "01",
                    "02"
                  ]
                ]
              },
              then: "Winter"
            }
          ],
          default: "No date found."
        }
       }
      }
     },
    {
     $group: {
      _id: {
        s: "$season",
        c: "$code"
      },
      cnt1: {
        $sum: 1
       }
      }
     },
     {
      $sort: {
        cnt1: -1
     }
    },
    {
     $group: {
      _id: "$_id.s",
      codes: {
        $push: "$_id.c"
      },
      cnt: {
        $sum: "$cnt1"
        }
      }
     },
     {
      $project: {
      _id: 0,
      season: "$_id",
      count: "$cnt",
      codes: {
        "$slice": [
          "$codes",
          2
         ]
        }
      }
     }
    ])
    

    Explained:

    1. Add one more field for season based on $switch per month(extracted from issueDate string)
    2. Group to collect per season/code.
    3. $sort per code DESCENDING
    4. group per season to form an array with most recurring codes in descending order.
    5. Project the fields to the desired output and $slice the codes to limit only to the fist two most recurring.

    Comment:
    Indeed keeping dates in string is not a good idea in general …

    Playground

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search