The following request calculates request-duration percentiles for the endpoint /api/v1/blabla:

    POST /filebeat-nginx-*/_search
    {
      "aggs": {
        "hosts": {
          "terms": {
            "field": "host.name",
            "size": 1000
          },
          "aggs": {
            "url": {
              "terms": {
                "field": "nginx.access.url",
                "size": 1000
              },
              "aggs": {
                "time_duration_percentiles": {
                  "percentiles": {
                    "field": "nginx.access.time_duration",
                    "percents": [
                      50,
                      90
                    ],
                    "keyed": true
                  }
                }
              }
            }
          }
        }
      },
      "size": 0,
      "query": {
        "bool": {
          "filter": [
            {
              "bool": {
                "should": [
                  {
                    "prefix": {
                      "nginx.access.url": "/api/v1/blabla" 
                    }
                  }
                ]
              }
            },
            {
              "range": {
                "@timestamp": {
                  "gte": "now-10m",
                  "lte": "now" 
                }
              }
            }
          ]
        }
      }
    }

The problem is that query-string arguments are also passed to this endpoint, for example /api/v1/blabla?lang=en&type=active or /api/v1/blabla/?lang=en&type=history.
As a result, the response shows percentiles for each such "separate" endpoint:

    {
      "key" : "/api/v1/blabla?lang=ru",
      "doc_count" : 423,
      "time_duration_percentiles" : {
        "values" : {
          "50.0" : 0.21199999749660492,
          "90.0" : 0.29839999079704277
        }
      }
    },
    {
      "key" : "/api/v1/blabla?lang=en&type=active",
      "doc_count" : 31,
      "time_duration_percentiles" : {
        "values" : {
          "50.0" : 0.21699999272823334,
          "90.0" : 0.2510000020265579
        }
      }
    },
    {
      "key" : "/api/v1/blabla?lang=en",
      "doc_count" : 4,
      "time_duration_percentiles" : {
        "values" : {
          "50.0" : 0.22700000554323196,
          "90.0" : 0.24899999797344208
        }
      }
    }

Is it possible to aggregate these similar endpoints into a single /api/v1/blabla bucket and get the overall percentiles?

Like this:

    {
      "key" : "/api/v1/blabla",
      "doc_count" : 4,
      "time_duration_percentiles" : {
        "values" : {
          "50.0" : 0.22700000554323196,
          "90.0" : 0.24899999797344208
        }
      }
    }

2 Answers


  1. Chosen as BEST ANSWER

    Thanks for the advice, Joe.

    I settled on this solution:

    "aggs": {
      "uri": {
        "terms": {
          "script": {
            "source": "def uri = /(\/[^\?]+)\?.+/.matcher(doc['nginx.access.url'].value);
               if (uri.matches()) {
                 return uri.group(1)
               } else { 
                return 'no_match'
               }"
             }
          }
       }
    }
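To see how the regex script above groups URLs compared with the split-based script in the other answer, here is a rough Python sketch of both rules. It is only a stand-in for the Painless logic, not something that runs inside Elasticsearch; the helper names `bucket_key_regex` and `bucket_key_split` are made up for illustration.

```python
import re

# Python stand-in for the Painless pattern /(\/[^\?]+)\?.+/ (illustrative only).
QUERY_PATTERN = re.compile(r"(/[^?]+)\?.+")


def bucket_key_regex(url: str) -> str:
    """Mimic the matcher-based script: the whole URL must match the pattern."""
    match = QUERY_PATTERN.fullmatch(url)
    return match.group(1) if match else "no_match"


def bucket_key_split(url: str) -> str:
    """Mimic the split-based script: keep everything before the first '?'."""
    return url.split("?", 1)[0]


if __name__ == "__main__":
    samples = [
        "/api/v1/blabla?lang=ru",
        "/api/v1/blabla?lang=en&type=active",
        "/api/v1/blabla",           # no query string at all
        "/api/v1/blabla/?lang=en",  # trailing slash before the query string
    ]
    for url in samples:
        print(url, "| regex:", bucket_key_regex(url), "| split:", bucket_key_split(url))
```

Two things stand out in this sketch: requests without a query string land in `no_match` under the regex rule but keep their path under the split rule, and under both rules `/api/v1/blabla/` (with a trailing slash) stays a separate key from `/api/v1/blabla`.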
    

  2. You could try splitting nginx.access.url in a script, but keep in mind that it will probably be slow, since the script runs for every matching document:

    {
      "aggs": {
        "hosts": {
          "terms": {
            "field": "host.name",
            "size": 1000
          },
          "aggs": {
            "url": {
              "terms": {
                "script": {
                  "source": "/\?/.split(doc['nginx.access.url'].value)[0]"       <--- here
                }, 
                "size": 1000
              },
              "aggs": {
                "time_duration_percentiles": {
                  "percentiles": {
                    "field": "nginx.access.time_duration",
                    "percents": [
                      50,
                      90
                    ],
                    "keyed": true
                  }
                }
              }
            }
          }
        }
      },
      ...
    }
    

    BTW, it’s good practice to extract the URI hostname, path, query string, etc. before you index your docs, so that you can aggregate on a plain field instead of a script. You can do so with the URI parts ingest processor (uri_parts) or other mechanisms.
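As one possible "other mechanism", the URL could be split on the client side before the document is sent to Elasticsearch. The sketch below uses only Python's standard library; the function name `split_request_url`, the output keys, and the optional trailing-slash handling are assumptions for illustration, not Filebeat or Elasticsearch field names.

```python
from urllib.parse import parse_qsl, urlsplit


def split_request_url(raw_url: str, strip_trailing_slash: bool = True) -> dict:
    """Split a request URL into path and query parts before indexing.

    The returned keys are hypothetical; map them to whatever fields
    your index actually uses.
    """
    parts = urlsplit(raw_url)
    path = parts.path
    if strip_trailing_slash and len(path) > 1:
        # Assumption: /api/v1/blabla/ and /api/v1/blabla are the same endpoint.
        path = path.rstrip("/")
    return {
        "original": raw_url,
        "path": path,
        "query": parts.query,
        "params": dict(parse_qsl(parts.query)),
    }


if __name__ == "__main__":
    for url in ("/api/v1/blabla?lang=en&type=active", "/api/v1/blabla/?lang=ru"):
        print(split_request_url(url))
```

With the path stored in its own keyword field, the terms aggregation can point at that field directly, which avoids running a script at query time.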
