Here are the mappings of my index PublicationsLikes:
- id : String
- account : String
- api : String
- date : Date
I’m currently running an aggregation in ES that groups the result counts by the id (of the publication). The buckets it returns look like this:
{
"key": "<publicationId-1>",
"doc_count": 25
},
{
"key": "<publicationId-2>",
"doc_count": 387
},
{
"key": "<publicationId-3>",
"doc_count": 7831
}
The returned “key” (the id) is useful, but I also need to select other fields of the publication, such as account and api. Something like this:
{
"key": "<publicationId-1>",
"api": "Facebook",
"accountId": "65465z4fe6ezf456ezdf",
"doc_count": 25
},
{
"key": "<publicationId-2>",
"api": "Twitter",
"accountId": "afaez5f4eaz",
"doc_count": 387
}
How can I manage this?
Thanks.
3 Answers
Thanks both for your quick replies. I think the first solution is the most "beautiful" (both in terms of the request and of retrieving the results), but both seem to be sub-aggregation queries.
{
  "size": 0,
  "aggs": {
    "publications": {
      "terms": { "size": 0, "field": "publicationId" },
      "aggs": {
        "sample": {
          "top_hits": { "size": 1, "_source": ["accountId", "api"] }
        }
      }
    }
  }
}
I think I need to be careful with the size=0 parameter of the terms aggregation, so, because I work with the Java API, I decided to pass Integer.MAX_VALUE instead of 0.
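To get from the response of a request like the one above to the flat shape asked for in the question, something like the following might work. It is only a minimal Python sketch on plain dictionaries, not the Java API: the function name flatten_buckets and the hand-written sample response are made up for illustration, while the aggregation names publications and sample are the ones used in the request.

```python
def flatten_buckets(response, terms_name="publications", hits_name="sample"):
    """Merge the first top_hits document of each bucket into one flat row."""
    rows = []
    for bucket in response["aggregations"][terms_name]["buckets"]:
        hits = bucket[hits_name]["hits"]["hits"]
        # top_hits was asked for size 1, so at most one document is present
        source = hits[0]["_source"] if hits else {}
        rows.append({"key": bucket["key"], **source, "doc_count": bucket["doc_count"]})
    return rows


# Hand-written fragment in the shape of an aggregation response (illustrative values)
example_response = {
    "aggregations": {
        "publications": {
            "buckets": [
                {
                    "key": "<publicationId-1>",
                    "doc_count": 25,
                    "sample": {
                        "hits": {
                            "hits": [
                                {"_source": {"api": "Facebook", "accountId": "65465z4fe6ezf456ezdf"}}
                            ]
                        }
                    },
                }
            ]
        }
    }
}

rows = flatten_buckets(example_response)
print(rows)
```

Each row then carries key, api, accountId and doc_count side by side, which is the layout the question was after.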
Thanks a lot, guys.
You can use a sub-aggregation for this.
The result will not be exactly what you want, but it will be similar:
You can also check the link to find more information
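The snippet that belonged to this answer is not preserved here, so the following is only one possible form of a sub-aggregation request, sketched in Python as a plain dictionary. It assumes the field names publicationId, api and accountId from the accepted request; the helper name nested_terms_request is hypothetical.

```python
import json


def nested_terms_request(group_field="publicationId", extra_fields=("api", "accountId")):
    """Build a terms aggregation with one nested terms aggregation per extra field."""
    sub_aggs = {name: {"terms": {"field": name}} for name in extra_fields}
    return {
        "size": 0,  # return no search hits, only aggregations
        "aggs": {
            "publications": {
                "terms": {"field": group_field},
                "aggs": sub_aggs,
            }
        },
    }


request_body = nested_terms_request()
print(json.dumps(request_body, indent=2))
```

With a request of this shape, every publication bucket contains nested api and accountId buckets rather than flat fields, which is why the result is only similar to the desired output.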
This requirement is best met with a top_hits aggregation, where you can sort the documents in each bucket, keep the first one, and control which fields are returned:
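The original snippet for this answer is missing, so here is a hedged sketch of what such a request could look like, built as a Python dictionary. Sorting on the date field from the mapping is an assumption, as are the aggregation names publications and sample and the helper name top_hits_request.

```python
import json


def top_hits_request(group_field="publicationId", sort_field="date",
                     fields=("accountId", "api")):
    """Group by group_field and keep one sorted document per bucket."""
    return {
        "size": 0,  # only aggregations are needed, no search hits
        "aggs": {
            "publications": {
                "terms": {"field": group_field},
                "aggs": {
                    "sample": {
                        "top_hits": {
                            "size": 1,  # keep only the first document
                            "sort": [{sort_field: {"order": "desc"}}],  # newest first (assumed)
                            "_source": list(fields),  # restrict the returned fields
                        }
                    }
                },
            }
        },
    }


request_body = top_hits_request()
print(json.dumps(request_body, indent=2))
```

The sort clause decides which document represents each bucket, and _source limits what comes back for it. Since all likes of one publication presumably share the same account and api, any single document per bucket should be enough.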