How to tune aggregation performance


(Alps Wang) #1

Hi there,

We have a problem with our ES aggregation query: it takes 10-12s to execute.

Here is our cluster information:

  1. We have 1 client node, 3 master nodes, and 6 data nodes.
  2. Each data node has 16 cores and 64GB of memory; we assigned 30GB to the heap.
  3. Each index has 6 shards and 1 replica.
  4. Index size is around 15GB per day, roughly 110 million records.
  5. Indices are optimized down to a maximum of 1 segment.
  6. ES version is 2.2, and doc_values are enabled for all fields.
  7. The query runs across all indices: 114 shards, 2 billion records, and 300GB of index.
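As a sanity check on the numbers above (heap pressure, fielddata usage, per-shard segment counts), the _cat APIs in ES 2.2 can report them directly; the index pattern here is a placeholder, not the actual index name:

GET _cat/nodes?v&h=name,heap.percent,heap.max,fielddata.memory_size
GET _cat/segments/my-index-*?v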

Here is my query:

{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "_cache": true,
                "kName": {
                  "index": "client_index",
                  "type": "client",
                  "id": "123",
                  "path": "cId"
                }
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "dateTerms": {
      "terms": {
        "field": "date"
      },
      "aggs": {
        "searchD": {
          "terms": {
            "field": "prodLog",
            "size": 10
          }
        }
      }
    }
  }
}
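To see where the time actually goes, ES 2.2 added an (at the time experimental) profile option that can be set on the same search body; this is a sketch of how to enable it, not a tuning fix in itself:

{
  "profile": true,
  "size": 0,
  "query": { ... same filtered query as above ... },
  "aggs": { ... same aggs as above ... }
}

The response then breaks down per-shard timings for the query and collector phases.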

Partial response:

"took": 11570,
"timed_out": false,
"_shards": {
  "total": 114,
  "successful": 114,
  "failed": 0
},
"hits": {
  "total": 187772187,
  "max_score": 0,
  "hits": []
}
So, a couple of questions:

  1. Since we enabled doc_values, do we still need to assign 30GB to the heap?
  2. When I fire the query, I see CPU usage reach 90-100% for a couple of seconds. Does that mean CPU is the bottleneck?
  3. We have to run lots of terms aggs against the field "prodLog". Should we disable doc_values and enable the fielddata cache?
  4. Is there any way to make it faster?
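On the last question, one option worth trying for a nested terms agg like this (a sketch against the aggregation above, same field names) is breadth-first collection, which defers computing the sub-aggregation until the top date buckets have been selected:

{
  "aggs": {
    "dateTerms": {
      "terms": {
        "field": "date",
        "collect_mode": "breadth_first"
      },
      "aggs": {
        "searchD": {
          "terms": {
            "field": "prodLog",
            "size": 10,
            "shard_size": 50
          }
        }
      }
    }
  }
}

breadth_first trades a second pass over the top buckets for a much smaller working set; shard_size slightly above size helps keep the top-10 accurate across 114 shards.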

Any comments are appreciated.

Thanks,
Alps


(Mark Walkom) #2

You may be oversharding; 6 shards for a 15GB daily index is excessive.

What sort of cardinality does that field have?
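If you're not sure, a cardinality aggregation on prodLog will give you an approximate distinct count (a sketch; run it against the same indices as your query):

{
  "size": 0,
  "aggs": {
    "prodLog_cardinality": {
      "cardinality": {
        "field": "prodLog"
      }
    }
  }
}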

