Slow Aggregation over Time based indices. Each index has around ~10M records

I am using a monthly time based indexes ( Eg - prefix-01-2019, prefix-02-2019 .. so on ) for storing the time based data. Each index has ~10 Million documents with 13 fields in each document. I am trying to perform nested aggregation queries across multiple indexes. But the query time is ~10-15 Seconds, which is quite slow.

Here is the link of the query - https://gist.github.com/durgaharish1993/8af7256f02c44988b3f1e09ada370323

  1. date_value - Field name of date
  2. dimension1 - It is a text field with cardinality ~ 5000
  3. dimension2 - It is a text field with cardinality = 6
  4. metric1 - It is a floating point number

Please let me know if there are ways to improve the query time.

Elasticsearch cluster details
4 Data node cluster

That sounds very slow given the amount of data. Given that you are querying across 2 months worth of data I would expect only 2 monthly indices to be involved. If you have 1 primary shard per index (which seems appropriate given the number of documents per index) 2 shard will be queried to serve the request. As queries against each shard is single threaded it means that only 2 threads across your 4 nodes will be busy with the query. If you also have slow storage this could perhaps explain the slow response times.

Hi Christian, Thanks for the reply. You are correct, I ran multiple tests based on 2 month of data - Just to correct, the response times are ~10s to 15s.
I am using 3 primary shards per index. Do you think reducing the number of primary shards to 1 will speed up the response time?.

Any kind of help would be great.

No, quite the opposite. I would expect that to give worse performance.

What type of hardware is your cluster deployed on? What kind of storage are you using? What is the load on the cluster when you are querying?

Thanks for confirming . We are using 4 node cluster on digitalocean's Standard droplet with following configuration

32 GB RAM - 8CPU's X 1 node - Acts as master
8 GB RAM - 4 CPU's X 3 node - Acts as data nodes

Storage - We are using SSD Storage.
There is no other activity in the cluster when we were testing/running these queries. All the 4 nodes are dedicated to Elasticsearch job.

You should always look to have 3 master eligible nodes in a cluster as having a single master eligible node is very bad and can lead to data loss. As dedicated master nodes should not serve requests they can generally be smaller than the data nodes, which do all the hard work.

I am not familiar with digitalocean's hosts, but if you have networked SSD storage this can still be the bottleneck. I would recommend looking at disk utilization and iowait, e.g. using iostat or similar tools.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.