Sum Aggregation Timeout

Hi,

I want to run a sum aggregation of bytes per IP over my traffic logs.
There are 80 million documents in one index with 3 shards and 1 replica, distributed across three data nodes.
Only the top ten results are shown in the visualization. The bytes field is mapped as "long" and the IP field as "ip".

The visualization often fails with a timeout.

I/O wait seems to be OK, and CPU is only at about 50%.

What could cause this problem?

When looking at iotop and top, I can see disk reads and a high CPU load only for the first few seconds of the query. Shouldn't there be a constant load until the query is finished?

Hello,

Which version of the stack are you on?

@lukeelmers can you please take a look at this?

Thanks,
Bhavya

Hi, I am running ES 6.5.3.

For context, this is the query that the table visualization runs:

{
  "aggs": {
    "2": {
      "terms": {
        "field": "source.ip",
        "size": 10,
        "order": {
          "1": "desc"
        }
      },
      "aggs": {
        "1": {
          "sum": {
            "field": "client.bytes"
          }
        }
      }
    }
  },
  "size": 0,
  "_source": {
    "excludes": []
  },
  "stored_fields": [
    "*"
  ],
  "script_fields": {},
  "docvalue_fields": [
    {
      "field": "@timestamp",
      "format": "date_time"
    },
    {
      "field": "event.start",
      "format": "date_time"
    }
  ],
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "source.network.name: Internet AND destination.network.name: Dmz",
            "analyze_wildcard": true,
            "default_field": "*"
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": 1545228018553,
              "lte": 1545268638554,
              "format": "epoch_millis"
            }
          }
        }
      ],
      "filter": [],
      "should": [],
      "must_not": []
    }
  }
}

Kibana runs on an additional coordinating node. All 4 nodes have 32 GB RAM with a 16 GB heap.

The 80 million documents contain about 9 million unique IP addresses. Is that too much for a sum aggregation?
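
To make the cardinality question concrete, here is a rough Python model of what a terms aggregation ordered by a sum sub-aggregation has to compute. It is only an illustration of the logic, not how Elasticsearch implements it; the function name and the sample data are made up.

```python
import heapq
from collections import defaultdict


def top_ips_by_bytes(docs, size=10):
    """Group (ip, bytes) pairs by IP, sum the bytes, and return the `size` largest sums."""
    totals = defaultdict(int)  # one entry per unique IP, comparable to one bucket per term
    for ip, nbytes in docs:
        totals[ip] += nbytes
    # The top `size` can only be picked once every unique IP has its final sum.
    top = heapq.nlargest(size, totals.items(), key=lambda kv: kv[1])
    return top, len(totals)


# Hypothetical sample data, not taken from the real index.
docs = [("10.0.0.1", 500), ("10.0.0.2", 300), ("10.0.0.1", 700), ("10.0.0.3", 100)]
top, buckets = top_ips_by_bytes(docs, size=2)
print(top)      # [('10.0.0.1', 1200), ('10.0.0.2', 300)]
print(buckets)  # 3
```

The point of the sketch is that "size": 10 limits only what is returned. Because the order depends on the sum, a total has to be kept for every unique IP that matches the query before the top ten are known, so the cost grows with the number of unique addresses in the selected time range rather than with the number of rows shown.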

Hi @fpr,

If you run the query from your example above directly against ES, do you also get timeouts? Or does it only happen in Kibana when rendering the visualization?
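
One way to check this is a small script that sends the same aggregation straight to Elasticsearch and compares the wall-clock time with the `took` and `timed_out` fields of the search response. The sketch below is only a starting point: the URL and the index name `traffic-logs` are placeholders I made up, and it leaves out the Kibana-added parts of the request (`docvalue_fields`, `stored_fields`, `script_fields`).

```python
import json
import time
import urllib.request

# Assumptions: both values are placeholders, not taken from this thread.
ES_URL = "http://localhost:9200"
INDEX = "traffic-logs"


def build_query(gte_ms, lte_ms, size=10):
    """Build the terms + sum request body used by the visualization."""
    return {
        "size": 0,
        "query": {
            "bool": {
                "must": [
                    {"query_string": {
                        "query": "source.network.name: Internet AND destination.network.name: Dmz",
                        "analyze_wildcard": True,
                        "default_field": "*",
                    }},
                    {"range": {"@timestamp": {
                        "gte": gte_ms, "lte": lte_ms, "format": "epoch_millis",
                    }}},
                ]
            }
        },
        "aggs": {
            "2": {
                "terms": {"field": "source.ip", "size": size, "order": {"1": "desc"}},
                "aggs": {"1": {"sum": {"field": "client.bytes"}}},
            }
        },
    }


def run_search(body, timeout_s=120):
    """POST the body to _search and return (wall_seconds, took_ms, timed_out)."""
    req = urllib.request.Request(
        f"{ES_URL}/{INDEX}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    start = time.monotonic()
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        result = json.load(resp)
    return time.monotonic() - start, result["took"], result["timed_out"]


# Example call (needs a reachable cluster, so it is left commented out):
# print(run_search(build_query(1545228018553, 1545268638554)))
```

If the direct call completes but takes longer than Kibana's request timeout (`elasticsearch.requestTimeout` in kibana.yml, 30 seconds by default as far as I know), that would suggest the query is simply slow rather than failing, and the timeout is being hit on the Kibana side.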