
Below is the JSON I have

[
  {
    "item_attr1": "abc",
    "item_attr2": "123",
    "item_attr3": "123",
    "item_id": "12345",
    "bucket_attr1": 1919,
    "bucket_attr2": "abc",
    "bucket_attr3": 1922,
    "bucket_attr4": "abc",
    "bucket_id_1": "abc",
    "bucket_id_2": "def",
    "bucket_id_3": "ghi",
    "articleattribute1": "abc",
    "articleattribute2": "abc",
    "articleattribute3": "2233",
    "article_id": "123458"
  },
  {
    "item_attr1": "abc",
    "item_attr2": "123",
    "item_attr3": "123",
    "item_id": "543421",
    "bucket_attr1": 1919,
    "bucket_attr2": "abc",
    "bucket_attr3": 1922,
    "bucket_attr4": "abc",
    "bucket_id_1": "abc",
    "bucket_id_2": "mef",
    "articleattribute1": "abc",
    "articleattribute2": "abc",
    "articleattribute3": "2233",
    "article_id": "12345"
  }
]

I need to group by bucket_id_1, bucket_id_2 and bucket_id_3, and then group that result by article_id, so that the output looks like this:

{
  "buckets": [
    {
      "bucket_id": "abc",
      "bucket_attr1": 1919,
      "bucket_attr2": "abc",
      "bucket_attr3": 1922,
      "bucket_attr4": "abc",
      "articles": [
        {
          "articleattribute1": "abc",
          "articleattribute2": "abc",
          "articleattribute3": "2233",
          "article_id": "12345",
          "items": [
            {
              "item_attr1": "abc",
              "item_attr2": "123",
              "item_attr3": "123",
              "item_id": "543421"
            },
            {
              "item_attr1": "abc",
              "item_attr2": "123",
              "item_attr3": "123",
              "item_id": "XYZ123"
            }
          ]
        },
        {
          "articleattribute1": "abc",
          "articleattribute2": "abc",
          "articleattribute3": "2233",
          "article_id": "123458",
          "items": [
            {
              "item_attr1": "abc",
              "item_attr2": "123",
              "item_attr3": "123",
              "item_id": "12345"
            }
          ]
        }
      ]
    },
    {
      "bucket_id": "def",
      "bucket_attr1": 1919,
      "bucket_attr2": "abc",
      "bucket_attr3": 1922,
      "bucket_attr4": "abc",
      "articles": [
        {
          "articleattribute1": "abc",
          "articleattribute2": "abc",
          "articleattribute3": "2233",
          "article_id": "12345",
          "items": [
            {
              "articleattribute1": "abc",
              "articleattribute2": "abc",
              "articleattribute3": "2233",
              "article_id": "123458",
              "items": [
                {
                  "item_attr1": "abc",
                  "item_attr2": "123",
                  "item_attr3": "123",
                  "item_id": "12345"
                }
              ]
            }
          ]
        }
      ]
    },
    {
      "bucket_id": "ghi",
      "bucket_attr1": 1919,
      "bucket_attr2": "abc",
      "bucket_attr3": 1922,
      "bucket_attr4": "abc",
      "articles": [
        {
          "articleattribute1": "abc",
          "articleattribute2": "abc",
          "articleattribute3": "2233",
          "article_id": "12345",
          "items": [
            {
              "articleattribute1": "abc",
              "articleattribute2": "abc",
              "articleattribute3": "2233",
              "article_id": "123458",
              "items": [
                {
                  "item_attr1": "abc",
                  "item_attr2": "123",
                  "item_attr3": "123",
                  "item_id": "12345"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}

I tried writing a Jolt transformation with shift operations and validated it on a demo site, but I could not get the result in my expected format.

2 Answers


  1. Chosen as BEST ANSWER

    So far I am only able to group by bucket_id_1. Below is the spec I have:

    [
      { // group by bucket_id_1 values
        "operation": "shift",
        "spec": {
          "*": {
            "*": "@(1,bucket_id_1).&",
            "article_id|articleattribute2|articleattribute3|articleattribute1": {
              "@": "@(2,bucket_id_1).articles[&2].&"
            },
            "item_id|item_attr1|item_attr2|item_attr3": {
              "@": "@(2,bucket_id_1).articles[&2].items[&2].&"
            }
          }
        }
      },
      { // nest all values within the OrderPreps array
        "operation": "shift",
        "spec": {
          "*": "OrderPreps[]"
        }
      },
      {
        "operation": "cardinality",
        "spec": {
          "*": {
            "*": {
              "*": "ONE",
              "articles": "MANY",
              "items": "MANY"
            }
          }
        }
      },
      { // get rid of redundant nulls
        "operation": "modify-overwrite-beta",
        "spec": {
          "*": "=recursivelySquashNulls"
        }
      }
    ]
    

    I have two issues to be resolved:

    1. I would like to split each entry in the source JSON when bucket_id_2 and bucket_id_3 are present, and then use a single bucket_id key in place of bucket_id_1, bucket_id_2 and bucket_id_3. For example, if the source JSON is

    [
      {
        "bucket_id_1": "1",
        "bucket_id_2": "2",
        "bucket_id_3": "3"
        // other key-values
      }
    ]

    then I would like to split it into

    [
      {
        "bucket_id": "1"
        // other key-values
      },
      {
        "bucket_id": "2"
        // other key-values
      },
      {
        "bucket_id": "3"
        // other key-values
      }
    ]
    

    and finally apply my transformation to this list, grouping by bucket_id.

    2. I am able to insert the articles of each bucket into that bucket. However, when two entries have different item_id values but the same article_id, there should be one article containing two items. Instead I am getting two articles with one item each, as follows:

    {
      "OrderPreps" : [ {
        "articles" : [ {
          "items" : [ {
            "item_attr1" : "abc",
            "item_attr2" : "123",
            "item_attr3" : "123",
            "item_id" : "12345"
          } ],
          "articleattribute1" : "abc",
          "articleattribute2" : "abc",
          "articleattribute3" : "2233",
          "article_id" : "12345"
        }, {
          "items" : [ {
            "item_attr1" : "abc",
            "item_attr2" : "123",
            "item_attr3" : "123",
            "item_id" : "543421"
          } ],
          "articleattribute1" : "abc",
          "articleattribute2" : "abc",
          "articleattribute3" : "2233",
          "article_id" : "12345"
        } ],
        "bucket_attr1" : 1919,
        "bucket_attr2" : "abc",
        "bucket_attr3" : 1922,
        "bucket_attr4" : "abc",
        "bucket_id_1" : "abc",
        "bucket_id_2" : "def",
        "bucket_id_3" : "ghi"
      } ]
    }
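
To pin down what the two issues above are asking for, here is a rough sketch of the intended logic in plain Python. It is not a Jolt spec, only a reference for what a spec would need to produce. The function names `split_by_bucket` and `group_buckets` are made up for illustration, and it assumes that the bucket and article attributes can be taken from the first entry seen for each bucket_id or article_id.

```python
# Hypothetical reference implementation of the desired grouping (not Jolt).
BUCKET_ID_KEYS = ("bucket_id_1", "bucket_id_2", "bucket_id_3")


def split_by_bucket(records):
    """Issue 1: emit one copy of each record per bucket id it carries."""
    out = []
    for rec in records:
        rest = {k: v for k, v in rec.items() if k not in BUCKET_ID_KEYS}
        for key in BUCKET_ID_KEYS:
            if key in rec:  # bucket_id_2 / bucket_id_3 may be absent
                out.append({"bucket_id": rec[key], **rest})
    return out


def group_buckets(records):
    """Issue 2: group by bucket_id, then by article_id, collecting items."""
    buckets = {}  # bucket_id -> bucket (articles kept as a dict while grouping)
    for rec in split_by_bucket(records):
        # Assumption: bucket attributes come from the first record seen.
        bucket = buckets.setdefault(rec["bucket_id"], {
            "bucket_id": rec["bucket_id"],
            **{k: v for k, v in rec.items() if k.startswith("bucket_attr")},
            "articles": {},
        })
        # Assumption: article attributes come from the first record seen.
        article = bucket["articles"].setdefault(rec["article_id"], {
            **{k: v for k, v in rec.items() if k.startswith("article")},
            "items": [],
        })
        article["items"].append(
            {k: v for k, v in rec.items() if k.startswith("item_")}
        )
    # Turn each bucket's article dict into a list for the final shape.
    return {
        "buckets": [
            {**b, "articles": list(b["articles"].values())}
            for b in buckets.values()
        ]
    }
```

With this logic, two entries that share a bucket id and an article_id but have different item_id values end up as one article holding two items, which is the behaviour the current spec does not give.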


  2. You can use the following transformation:

    [
      {
        "operation": "shift",
        "spec": {
          "*": {
            "*_1": "buckets[0].&(0,1)",
            "bucket_attr*": "buckets[0].&",
            "article*": "buckets[0].articles[0].&",
            "item*": {
              "@": "buckets[0].articles[0].items[&2].&"
            }
          }
        }
      },
      { // keep a single element from each array whose entries are identical repeats
        "operation": "cardinality",
        "spec": {
          "*": {
            "*": {
              "*": "ONE",
              "articles": {
                "*": {
                  "*": "ONE"
                }
              }
            }
          }
        }
      },
      {
        "operation": "sort"
      }
    ]
    

    You can try this spec on the demo site https://jolt-demo.appspot.com/.
