How to plot percentile distributions?

I'm trying to create a latency percentile distribution plot, like the one here: https://hdrhistogram.github.io/HdrHistogram/plotFiles.html

But I cannot find a good way to do this. Any ideas?

Details:

  • My source data in ES is a number of records for each measurement, each containing a data_source field and a value field, e.g. {A, 24}, {A, 43}, {B, 17}
  • I would like to split the series by data_source and calculate percentiles of their values
  • I would then like to plot this data so that the percentile, e.g. 99%, is on the X axis and the actual value at that percentile, e.g. 43, is on the Y axis
  • I tried the standard visualization (my best effort is pasted below), Timelion (only supports a time-based X axis), and Vega (this has potential, I think, but is too complex/undocumented for me to figure out...)
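
To make the target concrete, here is a minimal Python sketch of the calculation described in the list above, run on the three example records. The function names (percentile_series, hdr_x) and the nearest-rank percentile rule are assumptions chosen for illustration, not anything Kibana or Elasticsearch provides. The hdr_x helper mirrors the 1/(1-Percentile) column that HdrHistogram writes to its percentile output and that the linked plotter draws on a logarithmic X axis.

```python
import math
from collections import defaultdict


def percentile(sorted_values, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct% of all samples at or below it (illustrative choice)."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def percentile_series(records, percents):
    """Group (data_source, value) records by source and return
    {source: [(percentile, value_at_percentile), ...]}."""
    grouped = defaultdict(list)
    for source, value in records:
        grouped[source].append(value)
    series = {}
    for source, values in grouped.items():
        values.sort()
        series[source] = [(p, percentile(values, p)) for p in percents]
    return series


def hdr_x(pct):
    """X position for an HdrHistogram-style plot: 1 / (1 - p),
    intended for a log-scaled axis (pct must be below 100)."""
    return 1.0 / (1.0 - pct / 100.0)


# Example records from the question: {A, 24}, {A, 43}, {B, 17}
records = [("A", 24), ("A", 43), ("B", 17)]
series = percentile_series(records, [50, 90, 99])
for source, points in sorted(series.items()):
    print(source, points)
```

Each series then becomes one line, with hdr_x(percentile) on the X axis and the value on the Y axis.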

The best I could come up with is pasted below, but this does not work very well, because I cannot set the X axis to consist of percentiles:

{
  "title": "Bench: Parse Total [us]",
  "type": "line",
  "params": {
    "addLegend": true,
    "addTimeMarker": false,
    "addTooltip": true,
    "categoryAxes": [
      {
        "id": "CategoryAxis-1",
        "labels": {
          "show": false,
          "truncate": 100
        },
        "position": "bottom",
        "scale": {
          "type": "linear"
        },
        "show": false,
        "style": {},
        "title": {},
        "type": "category"
      }
    ],
    "grid": {
      "categoryLines": false,
      "style": {
        "color": "#eee"
      },
      "valueAxis": null
    },
    "legendPosition": "right",
    "seriesParams": [
      {
        "data": {
          "id": "1",
          "label": "Average timing_ParseTotalUs"
        },
        "drawLinesBetweenPoints": true,
        "mode": "normal",
        "show": "true",
        "showCircles": false,
        "type": "line",
        "valueAxis": "ValueAxis-1",
        "interpolate": "linear"
      }
    ],
    "times": [],
    "type": "line",
    "valueAxes": [
      {
        "id": "ValueAxis-1",
        "labels": {
          "filter": false,
          "rotate": 0,
          "show": true,
          "truncate": 100
        },
        "name": "LeftAxis-1",
        "position": "left",
        "scale": {
          "mode": "normal",
          "type": "linear"
        },
        "show": true,
        "style": {},
        "title": {
          "text": "Average timing_ParseTotalUs"
        },
        "type": "value"
      }
    ]
  },
  "aggs": [
    {
      "id": "1",
      "enabled": true,
      "type": "avg",
      "schema": "metric",
      "params": {
        "field": "timing_ParseTotalUs"
      }
    },
    {
      "id": "2",
      "enabled": true,
      "type": "terms",
      "schema": "segment",
      "params": {
        "field": "timing_ParseTotalUs",
        "size": 1000,
        "order": "desc",
        "orderBy": "1",
        "otherBucket": false,
        "otherBucketLabel": "Other",
        "missingBucket": false,
        "missingBucketLabel": "Missing"
      }
    },
    {
      "id": "3",
      "enabled": true,
      "type": "terms",
      "schema": "group",
      "params": {
        "field": "logFilePath",
        "size": 100,
        "order": "desc",
        "orderBy": "_key",
        "otherBucket": false,
        "otherBucketLabel": "Other",
        "missingBucket": false,
        "missingBucketLabel": "Missing"
      }
    }
  ]
}

Why doesn't it work with a line chart? Can you post a screenshot? (A query like this is really hard to understand from a visualization point of view.)

Hi Marius,

Here is a screenshot. As you can see, there are 2 issues:

  • The X axis is overcrowded with labels showing the latency values, and there is no good way to adjust this. Ideally, it should be a histogram-like axis showing percentiles like 99%, etc.
  • When you add more series to the diagram, you start losing data on the individual lines, as the number of data points seems to be decided for all the lines together rather than per line.

By the way, the code pasted above is a visualization definition copied from Kibana, so hopefully you can import it and reproduce this on your local instance.

I managed to get this working with a custom Vega script.
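
The Vega script itself was not posted in this thread. As a hedged sketch of the data side only, the Python below builds an Elasticsearch request body that splits documents with a terms aggregation and computes percentiles per bucket with a percentiles sub-aggregation. The field names are taken from the visualization definition above; the function name, the aggregation names (by_source, latency_percentiles), and the chosen percents are assumptions for illustration. Note that Elasticsearch computes these percentiles approximately by default.

```python
import json


def percentile_query(group_field, value_field, percents, max_groups=100):
    """Build a search body: one terms bucket per series, each with
    a percentiles sub-aggregation over the measured value."""
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "aggs": {
            "by_source": {  # hypothetical aggregation name
                "terms": {"field": group_field, "size": max_groups},
                "aggs": {
                    "latency_percentiles": {  # hypothetical name
                        "percentiles": {
                            "field": value_field,
                            "percents": percents,
                        }
                    }
                },
            }
        },
    }


body = percentile_query(
    "logFilePath", "timing_ParseTotalUs", [50, 90, 99, 99.9, 99.99]
)
print(json.dumps(body, indent=2))
```

Because the percentiles are calculated inside each terms bucket, every series gets the same set of percentile points regardless of how many series are plotted.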

It would be great, though, to get this officially supported someday, as percentile plots are a de facto standard for benchmark analysis.
