Here are the mappings of my index PublicationsLikes:

  • id : String
  • account : String
  • api : String
  • date : Date

I’m currently running an aggregation on ES that groups the result counts by the id (of the publication).

{
    "key": "<publicationId-1>",
    "doc_count": 25
},
{
    "key": "<publicationId-2>",
    "doc_count": 387
},
{
    "key": "<publicationId-3>",
    "doc_count": 7831
}

The returned “key” (the id) is useful, but I also need to select other fields of the publication, such as account and api. Something like this:

{
   "key": "<publicationId-1>",
   "api": "Facebook",
   "accountId": "65465z4fe6ezf456ezdf",
   "doc_count": 25
},
{
  "key": "<publicationId-2>",
  "api": "Twitter",
  "accountId": "afaez5f4eaz",
  "doc_count": 387
}

How can I manage this?

Thanks.

3

Answers


  1. Chosen as BEST ANSWER

    Thanks to both of you for your quick replies. I think the first solution is the most "beautiful" (both in terms of the request and of retrieving the results), but both seem to be sub-aggregation queries.

    {
      "size": 0,
      "aggs": {
        "publications": {
          "terms": {
            "size": 0,
            "field": "publicationId"
          },
          "aggs": {
            "sample": {
              "top_hits": {
                "size": 1,
                "_source": ["accountId", "api"]
              }
            }
          }
        }
      }
    }

    I think I need to be careful with the size=0 parameter on the terms aggregation, so, because I work with the Java API, I decided to pass Integer.MAX_VALUE instead of 0.

    Thanks a lot, guys.


  2. You can use a sub-aggregation for this.

    GET /PublicationsLikes/_search
    {
     "aggs" : {
      "ids": {
       "terms": {
        "field": "id"
       },
       "aggs": {
        "accounts": {
         "terms": {
          "field": "account",
          "size": 1
         }
        }
       }
      }
     }
    }
    

    The result will not be exactly what you want, but it will be fairly close:

    {
        "key": "<publicationId-1>",
        "doc_count": 25,
        "accounts": {
          "buckets": [
            {
              "key": "<account-1>",
              "doc_count": 25
            }
          ]
        }
    },
    {
        "key": "<publicationId-2>",
        "doc_count": 387,
        "accounts": {
          "buckets": [
            {
              "key": "<account-2>",
              "doc_count": 387
            }
          ]
        }
    },
    {
        "key": "<publicationId-3>",
        "doc_count": 7831,
        "accounts": {
          "buckets": [
            {
              "key": "<account-3>",
              "doc_count": 7831
            }
          ]
        }
    }
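
If you need the flat shape from the question, the nested buckets above can be merged on the client side. Here is a minimal sketch in Python, assuming the bucket list has already been pulled out of the response under the "ids" aggregation; the function name `flatten_terms_buckets` and the output field name "account" are only illustrative:

```python
def flatten_terms_buckets(buckets, sub_agg="accounts", out_field="account"):
    """Merge the first sub-bucket key of each bucket into a flat row."""
    rows = []
    for bucket in buckets:
        # With "size": 1 the sub-aggregation returns at most one bucket.
        sub_buckets = bucket.get(sub_agg, {}).get("buckets", [])
        rows.append({
            "key": bucket["key"],
            out_field: sub_buckets[0]["key"] if sub_buckets else None,
            "doc_count": bucket["doc_count"],
        })
    return rows


# Sample data shaped like the response shown above.
terms_sample = [
    {"key": "<publicationId-1>", "doc_count": 25,
     "accounts": {"buckets": [{"key": "<account-1>", "doc_count": 25}]}},
    {"key": "<publicationId-2>", "doc_count": 387,
     "accounts": {"buckets": [{"key": "<account-2>", "doc_count": 387}]}},
]

terms_rows = flatten_terms_buckets(terms_sample)
```

Each additional field (api, for example) would need its own nested terms aggregation with this approach.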
    

    You can also check the link to find more information

  3. This requirement is best met with a top_hits aggregation, which lets you sort the documents in each bucket, pick the first one, and control which fields are returned:

    {
      "size": 0,
      "aggs": {
        "publications": {
          "terms": {
            "field": "id"
          },
          "aggs": {
            "sample": {
              "top_hits": {
                "size": 1,
                "_source": ["api","accountId"]
              }
            }
          }
        }
      }
    }
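
With top_hits, the selected fields come back under each bucket as sample.hits.hits[0]._source, so a small client-side step is still needed to get the flat rows from the question. A minimal sketch in Python, assuming the bucket list has already been extracted from the "publications" aggregation; the function name `flatten_top_hits_buckets` is hypothetical and the sample values are taken from the question:

```python
def flatten_top_hits_buckets(buckets, agg_name="sample"):
    """Merge the _source of each bucket's first top hit into a flat row."""
    rows = []
    for bucket in buckets:
        hits = bucket[agg_name]["hits"]["hits"]
        # "size": 1 in top_hits means there is at most one hit per bucket.
        source = hits[0]["_source"] if hits else {}
        rows.append({"key": bucket["key"], **source,
                     "doc_count": bucket["doc_count"]})
    return rows


# Sample data shaped like a top_hits response, trimmed to the relevant keys.
top_hits_sample = [
    {"key": "<publicationId-1>", "doc_count": 25,
     "sample": {"hits": {"hits": [
         {"_source": {"api": "Facebook",
                      "accountId": "65465z4fe6ezf456ezdf"}}]}}},
    {"key": "<publicationId-2>", "doc_count": 387,
     "sample": {"hits": {"hits": [
         {"_source": {"api": "Twitter", "accountId": "afaez5f4eaz"}}]}}},
]

top_hits_rows = flatten_top_hits_buckets(top_hits_sample)
```

This works for any number of fields listed in "_source", which is the main advantage over nesting one terms aggregation per field.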
    