skip to Main Content

In a workbook, I’m displaying a time chart comparing throttle vs non-throttle requests. The query employs a summarize operator, and while it functions properly, it frequently encounters the "too many datapoints (10000)" error. I attempted to use the limit operator, but it did not work (I think its due to the fact that summarize operator is doing two operations and data points exceed for one of them)

In addition, I’ve included a Timerange picker in the workbook to limit the time span, but even a small window on a busy day can cause the error to appear.

I was also thinking about limiting records in another way, but I’m not sure how. The chart must show RPS so cannot change that to RPM.

 Requests
    | project TimeGenerated, ResultType
    | summarize Throttled =  countif(ResultType == "Throttled") , Total_Requests = count() by  bin(TimeGenerated,1s);
  //| limit 9999 <-- This does not work

enter image description here

2

Answers


  1. This is straightforward rendering issue.

    First –
    There are always limitations, whether it is in the graph visualization library, your screen display resolution, the amount of data your web browser can handle without getting stuck, etc.

    Second –
    You gain nothing from staring at a graph like this:

    Overpacked graph

    You need to lower the granularity of the graph.
    Here are a few options.

    P.S.
    I often prefer make-series to summarize, since it deals with absent values.
    Note that this may come with extra resources utilization.

    1. Increase the bin size

    The choice is not between 1s and 1m.
    You have a whole range in between, E.g. bin(TimeGenerated, 15s).

    let Requests = materialize(range i from 1 to 1000000 step 1 | extend TimeGenerated = ago(1d*rand()), ResultType = case(rand() < 0.02, "Throttled", "Other"));
    Requests
        | make-series Throttled = countif(ResultType == "Throttled") , Total_Requests = count() on TimeGenerated step 15s
        | render timechart 
    

    Fiddle

    Increased bin

    2. Use double aggregation

    First aggregate by 1s and then re-aggregate by a larger time span (E.g.10m), displaying the peaks.
    In this way your graph becomes much more sparse, however you are not losing exceptional values in 1s granularity.

    let Requests = materialize(range i from 1 to 1000000 step 1 | extend TimeGenerated = ago(1d*rand()), ResultType = case(rand() < 0.02, "Throttled", "Other"));
    Requests
        | summarize Throttled = countif(ResultType == "Throttled") , Total_Requests = count() by bin(TimeGenerated,1s)
        | make-series max(Throttled), max(Total_Requests) on TimeGenerated step 10m
        | render timechart 
    

    Fiddle

    Double aggregation

    3. Focus on what you’re really after

    Many times, there is an extra step we can take in order to create a visualization that really emphasizes the phenomena we are interesting at.
    In this case it might be the ratio between Throttled & Total_Requests, and then we can use a single line that makes it very easy to understand what’s going on.
    This graph might be used together with the previous graphs, so we won’t lose the whole picture.

    let Requests = materialize(range i from 1 to 1000000 step 1 | extend TimeGenerated = ago(1d*rand()), ResultType = case(rand() < 0.02, "Throttled", "Other"));
    Requests
    | summarize Throttled = countif(ResultType == "Throttled"), Total_Requests = count() by bin(TimeGenerated, 1s)
    | make-series ratio = max(1.0 * Throttled / Total_Requests) on TimeGenerated step 5m
    | render timechart 
    

    Fiddle

    Ratio

    Login or Signup to reply.
  2. This statement confuses me and indicates some other problem with your query:

    In addition, I’ve included a Timerange picker in the workbook to limit the time span, but even a small window on a busy day can cause the error to appear.

    This doesn’t make any sense, as on busy days and on non-busy days, the number of seconds doesn’t change. there’s always 86400 seconds in a day (well, ignoring day light savings), so busy day or not busy day, if you want to display individual seconds in a chart, and you’re binning by second, the max time range you could show with only one series of data is under 3 hours (3 hr * 60 min * 6 sec = 10800 points)

    If you are forced to be requests per second, and you need to actually return individual requests, you’ll have to further limit the time range, its the only way math works. Or, you’ll have to do order by TimeGenerated desc | limit 10000 and keep only the newest data, etc.

    Or, as David says, you’d have to do double aggregation and then have a way to zoom in. (you could create an outer chart that is requests per minute or higher aggregation, and then enable something like time brushing to allow zoom in to higher precision smaller time ranges

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search