Optimising Elasticsearch for timeseries data with high search query QPS

viswanathk243 · September 23, 2019, 3:11pm

Hey,

We are using Elasticsearch 2.2.0 with an expected search load of 80K QPS and a target p99 latency of 180 ms.

The current cluster has 40 nodes, each with 12 cores and 24 GB of memory (12 GB heap). The data set is 150K documents in one shard with 39 replicas. It is time-series data: we periodically update the 150K docs with new values under the same ids. All docs have a TTL attached, with a minimum of 30 minutes and a maximum of 9 hours.

The issue is that search latency degrades as time goes on. So far we have figured out that this is caused by an increase in the number of segments. Force merging down to 1 segment improves things, but only for the next 30-40 minutes, after which the number of segments climbs back to 12-15 per node and ~600 across the cluster.

Is there any configuration that would help with a setup like this? Our refresh interval is set to 30s and can be increased up to 5 minutes.

Christian_Dahlqvist · October 19, 2019, 7:59am

This is a very unusual scenario, but if you are fine serving somewhat stale data, which seems to be the case given that you are willing to increase the refresh interval, you could perhaps try something like this:

Create a master index (basically the current one) that you continuously update, and set it to have only one replica.
Then create an alias that you will query through. This will point to a special read index that you will periodically recreate.
Every X minutes, reindex the current content of the master index into a new read index (with 0 or 1 replica configured). Once reindexing has completed, force merge the new index down to a single segment, and once that has completed, increase the number of replicas to 39. Once each node has a replica, point the read alias to the new index and delete the old read index.

This will allow you to always query a force-merged index while handling updates in the background. How often you can rotate the read index will depend on how long the reindexing and force merge process takes. This may also allow you to reduce the refresh interval on the master index.
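
As a rough illustration, one rotation cycle of the steps above could be laid out as the ordered list of REST calls below. This is only a sketch under several assumptions: the index and alias names are hypothetical, the `_reindex` API is assumed to be available (it was added in Elasticsearch 2.3, so on 2.2.0 the copy step would need a scan/scroll plus bulk loop instead), and the code only builds the requests rather than sending them, so it can be wired to whichever HTTP client you use.

```python
import json


def plan_read_index_rotation(master, alias, old_read, new_read, replicas=39):
    """Return the ordered REST calls for one rotation cycle.

    Each entry is a (method, path, body) tuple; body is None when the
    request carries no JSON payload. All index/alias names are supplied
    by the caller and are purely illustrative.
    """
    return [
        # 1. Create the new read index without replicas so that copying
        #    and merging only touch a single shard copy. Mappings are not
        #    copied by _reindex and would have to be added here in real use.
        ("PUT", f"/{new_read}",
         {"settings": {"number_of_shards": 1, "number_of_replicas": 0}}),
        # 2. Copy the current contents of the master index.
        #    Assumption: the _reindex API exists (Elasticsearch 2.3+).
        ("POST", "/_reindex",
         {"source": {"index": master}, "dest": {"index": new_read}}),
        # 3. Merge the new index down to a single segment.
        ("POST", f"/{new_read}/_forcemerge?max_num_segments=1", None),
        # 4. Fan the merged shard out to the search nodes.
        ("PUT", f"/{new_read}/_settings",
         {"index": {"number_of_replicas": replicas}}),
        # 5. Wait for all replicas to be allocated. The caller should check
        #    the "timed_out" field of the response before continuing.
        ("GET", f"/_cluster/health/{new_read}?wait_for_status=green&timeout=10m",
         None),
        # 6. Swap the read alias from the old index to the new one in a
        #    single _aliases call, so queries never see a missing alias.
        ("POST", "/_aliases", {"actions": [
            {"remove": {"index": old_read, "alias": alias}},
            {"add": {"index": new_read, "alias": alias}},
        ]}),
        # 7. Drop the previous read index once nothing points at it.
        ("DELETE", f"/{old_read}", None),
    ]


if __name__ == "__main__":
    steps = plan_read_index_rotation("master", "read", "read-0001", "read-0002")
    for method, path, body in steps:
        print(method, path, json.dumps(body) if body else "")
```

Keeping the plan as plain data makes it easy to log or dry-run a cycle before pointing it at a live cluster; a scheduler would then run one cycle every X minutes, stopping at the first failed step so the alias keeps pointing at the last good read index.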

