Slow percentile aggregation: Is it unusual or normal?

Hi All,

I have a dashboard for Netflow data with 10 visualizations of different types. Nine of them load in under 2s, but one percentile aggregation takes 21s, which is too slow for users.

More info about my ES cluster is in the thread "How can I speed up Kibana aggregation?"

Is it normal for a percentiles aggregation to be this slow?

An average aggregation over the same set of documents takes very little time.

Request body of the percentile aggregation:

{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "*",
          "analyze_wildcard": true
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "@timestamp": {
                  "gte": 1460401200000,
                  "lte": 1460412000000,
                  "format": "epoch_millis"
                }
              }
            }
          ],
          "must_not": []
        }
      }
    }
  },
  "aggs": {
    "2": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "5m",
        "time_zone": "America/Los_Angeles",
        "min_doc_count": 1,
        "extended_bounds": {
          "min": 1460401200000,
          "max": 1460412000000
        }
      },
      "aggs": {
        "1": {
          "percentiles": {
            "field": "time-taken",
            "percents": [
              95,
              99,
              99.9
            ]
          }
        }
      }
    }
  }
}

Request body of the average aggregation:

{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "analyze_wildcard": true,
          "query": "*"
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "@timestamp": {
                  "gte": 1460401200000,
                  "lte": 1460412000000,
                  "format": "epoch_millis"
                }
              }
            }
          ],
          "must_not": []
        }
      }
    }
  },
  "aggs": {
    "2": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "5m",
        "time_zone": "America/Los_Angeles",
        "min_doc_count": 1,
        "extended_bounds": {
          "min": 1460401200000,
          "max": 1460412000000
        }
      },
      "aggs": {
        "1": {
          "avg": {
            "field": "time-taken"
          }
        }
      }
    }
  }
}

Thanks,

Percentiles are indeed much more costly than average or stats aggregations. Since the field that you are aggregating is a response time, you should be able to use the HDR Histogram alternative implementation, which is expected to perform better: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-percentile-aggregation.html#_hdr_histogram (the end of the section explains why we are not making it the default).
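For illustration, here is a rough sketch of what the "aggs" part of the request above might look like with HDR enabled. It assumes a 2.x release that includes HDR support (the "filtered" query in your request suggests 2.x), where, as far as I recall, the option is spelled "method": "hdr" next to "number_of_significant_value_digits". Newer releases use a nested "hdr": { "number_of_significant_value_digits": 3 } object instead, so please check the docs for your exact version. The value 3 is only an example precision setting, not a recommendation. Also note that HDR only supports positive values, which should be fine for a response time field.

```json
{
  "aggs": {
    "2": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "5m",
        "time_zone": "America/Los_Angeles",
        "min_doc_count": 1
      },
      "aggs": {
        "1": {
          "percentiles": {
            "field": "time-taken",
            "percents": [95, 99, 99.9],
            "method": "hdr",
            "number_of_significant_value_digits": 3
          }
        }
      }
    }
  }
}
```

The query and "size" parts of the request stay as they are; only the percentiles block changes.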

Thanks, I'll try HDR