
When I group and add (read: merge) objects based on a value, I would like certain attributes to be concatenated instead of getting the default "object on the right wins" behavior.


Put another way, I would like the addition/merge of these two objects

[
  {
    "location": "apc1",
    "type": "app",
    "fgrp": "cert-apc-1"
  },
  {
    "location": "apc1",
    "type": "ctl",
    "cgrp": "ctl-apc1"
  }
]

to produce this

[
  {
    "location": "apc1",
    "type": "app,ctl",
    "fgrp": "cert-apc-1",
    "cgrp": "ctl-apc1"
  }
]

I attempted group_by(.location) | map(add), but as noted, it just keeps the last value for each key. I also looked at transpose but was unsure it would meet the requirement. I’m not married to a comma as the delimiter if something else is easier.

2 Answers


  1. You could write your own merge operation. For instance, iterate over all items, then over each item’s fields, and collect every value in an array under the field’s name, grouped under the item’s .location value. Finally, repair the .location arrays by replacing each of them with its first item:

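    # Collect every field value into an array, keyed by location, then by field name
    reduce .[] as $item ({};
      reduce ($item | to_entries[]) as $ent (.;
        .[$item.location][$ent.key] += [$ent.value]
      )
    ) | map(.location |= first)  # map on an object yields an array of its values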
    
    [
      {
        "location": "apc1",
        "type": [
          "app",
          "ctl"
        ],
        "fgrp": [
          "cert-apc-1"
        ],
        "cgrp": [
          "ctl-apc1"
        ]
      }
    ]
    


    Going further, you could concatenate the array values using join with a glue string (here ","). For that, change the repair of .location so that it stays an array (holding only its first item), because join only accepts arrays:

    reduce .[] as $item ({};
      reduce ($item | to_entries[]) as $ent (.;
        .[$item.location][$ent.key] += [$ent.value]
      )
    ) | map(.location |= [first] | .[] |= join(","))
    
    [
      {
        "location": "apc1",
        "type": "app,ctl",
        "fgrp": "cert-apc-1",
        "cgrp": "ctl-apc1"
      }
    ]
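
    The reason for keeping .location as a one-element array can be checked in isolation. This is a minimal sketch, assuming a jq binary is available on the PATH; the shell variable names are only for illustration:

    ```shell
    # join on a one-element array returns that element unchanged
    ok=$(jq -n -r '["apc1"] | join(",")')
    # join on a bare string raises an error, which is caught here
    bad=$(jq -n -r '"apc1" | try join(",") catch "join failed"')
    printf '%s\n%s\n' "$ok" "$bad"
    ```

    The first call prints apc1, the second prints join failed, which is why .location |= first cannot be combined with .[] |= join(",").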
    


    For convenience, you can now wrap this filter into a function definition with its dynamic parts (the index and join expressions) parametrized:

    def merge(idx_expr; join_expr):
      reduce .[] as $item ({};
        reduce ($item | to_entries[]) as $ent (.;
          .[$item | idx_expr][$ent.key] += [$ent.value]
        )
      ) | map(join_expr);
    
    merge(.location; .location |= [first] | .[] |= join(","))
    
    [
      {
        "location": "apc1",
        "type": "app,ctl",
        "fgrp": "cert-apc-1",
        "cgrp": "ctl-apc1"
      }
    ]
    


    Or make your function more general-purpose by leaving the final mapping out of it, so that it produces an object (not an array) with the indices as keys. It then acts more like an array-valued variant of group_by, and it no longer treats the indexed field specially (which, after all, could be any index expression, not just the value of one common field):

    def merge_by(f):
      reduce .[] as $item ({};
        reduce ($item | to_entries[]) as $ent (.;
          .[$item|f][$ent.key] += [$ent.value]
        )
      );
    
    merge_by(.location)
    
    {
      "apc1": {
        "location": [
          "apc1",
          "apc1"
        ],
        "type": [
          "app",
          "ctl"
        ],
        "fgrp": [
          "cert-apc-1"
        ],
        "cgrp": [
          "ctl-apc1"
        ]
      }
    }
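
    To get from this intermediate object back to the requested output, the mapping that was left out can be applied afterwards. A sketch, assuming jq is on the PATH and using the question's sample data inline (the shell variable names are illustrative):

    ```shell
    input='[{"location":"apc1","type":"app","fgrp":"cert-apc-1"},{"location":"apc1","type":"ctl","cgrp":"ctl-apc1"}]'
    result=$(printf '%s' "$input" | jq -c '
      def merge_by(f):
        reduce .[] as $item ({};
          reduce ($item | to_entries[]) as $ent (.;
            .[$item|f][$ent.key] += [$ent.value]
          )
        );
      # keep one location per group, then join each field with a comma
      merge_by(.location) | map(.location |= [first] | .[] |= join(","))
    ')
    printf '%s\n' "$result"
    ```

    Note that the index expression has to produce a string here, since its result is used as an object key.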
    


  2. Attempted group_by(.location) | map(add) but as noted it basically just keeps the last value.

    Following this more functional group_by-based approach, you could replace add with a custom function that first decomposes its input array (of grouped objects) into a stream of path-value pairs, then flips the first two positions in every path array (from [position, field name] to [field name, position]), and recomposes it again. The result is a single object (the field names moved to first position) whose items contain arrays of that field’s values across the group (with null, or a shorter array, at positions where the corresponding item did not have that field). Thus, a subsequent join also requires the arrays to be reduced to only their non-null values:

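    # Transpose the group: swap [position, field] to [field, position] in every path
    def merge: fromstream(tostream | first[:2] |= reverse)
      # keep one location, drop nulls, then join each field's values
      | .location |= .[:1] | .[] |= (map(values) | join(","));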
    
    group_by(.location) | map(merge)
    
    [
      {
        "location": "apc1",
        "type": "app,ctl",
        "fgrp": "cert-apc-1",
        "cgrp": "ctl-apc1"
      }
    ]
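
    The intermediate, transposed object and the effect of the null values can be inspected separately. A sketch, assuming jq is on the PATH and using the question's sample data inline (the shell variable names are illustrative):

    ```shell
    input='[{"location":"apc1","type":"app","fgrp":"cert-apc-1"},{"location":"apc1","type":"ctl","cgrp":"ctl-apc1"}]'
    # transpose the single group without the repair and join steps
    transposed=$(printf '%s' "$input" | jq -c '
      group_by(.location) | .[0] | fromstream(tostream | first[:2] |= reverse)
    ')
    printf '%s\n' "$transposed"
    # join renders null as an empty string, hence the map(values) step
    joined=$(jq -n -c '[null,"ctl-apc1"] | [join(","), (map(values) | join(","))]')
    printf '%s\n' "$joined"   # [",ctl-apc1","ctl-apc1"]
    ```

    For the sample data, the transposed object has "cgrp": [null,"ctl-apc1"] (the first item lacks that field) and "fgrp": ["cert-apc-1"] (a shorter array, as the second item lacks it). Without map(values), the cgrp value would come out with a leading comma.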
    

