How to speed up date_range aggregation with multiple ranges


(Ha Truong) #1

We are using date_range aggregation where we have 30 date ranges in query, and we also need to do aggregation in each range, please see detail of query as below.
This is taking about 6sec against data set where we have only about 3M docs per range, about 20K uniq value of field2 (which is used in cardinality) and single value in all docs for field1. CPU and Memory usage on ES is very low (<5%).

Is there a way we can bring this query down to less than 1 sec? (e.g not sure if it helps if we break the search to multiple search with smaller number of ranges and run msearch or es already executes different date ranges in the query in parallel? any tuning in query we can do?)

(Note: if we remove cardinality aggregation, the query took about 900ms).

  "size": 0,
  "aggregations": {
    "rolling_date_range": {
      "date_range": {
        "field": "modified_at",
        "ranges": [
          {
            "from": "1527811200000",
            "to": "1530403200000"
          },
          ... There are 30 date ranges here
          {
            "from": "1530316800000",
            "to": "1532908800000"
          }
        ],
        "keyed": false
      },
      "aggregations": {
        "field1": {
          "terms": {
            "field": "field1",
            "size": 30,
            "min_doc_count": 1,
            "shard_min_doc_count": 0,
            "show_term_doc_count_error": false,
            "order": [
              {
                "_count": "desc"
              },
              {
                "_key": "asc"
              }
            ]
          },
          "aggregations": {
            "count_distinct_field2": {
              "cardinality": {
                "field": "field2"
              }
            }
          }
        }
      }
    }
  }
}