How to achieve good performance (huge daily indices)

Naseem · January 8, 2020, 4:02pm

Hello,

Our free-text search is unusable. We run 3 master nodes and 4 data/ingest nodes.

We have 1 primary shard and 1 replica per index.

Index names are suffixed with the date, and a new index is created every day (typical).

The daily indices are huge (>400GB).

Furthermore, the data nodes are not being used evenly. Two of them are maxing out their requested (Kubernetes) 3 CPU cores and using much more disk space than the other two, which sit idle.

What are we doing wrong? Should we increase replicas, primary shard count, both? Change our rollover strategy?

dadoonet · January 8, 2020, 4:12pm

400 GB for a single shard is probably too much.

May I suggest you look at the following resources about sizing:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing

And https://www.elastic.co/webinars/using-rally-to-get-your-elasticsearch-cluster-size-right

Naseem · January 9, 2020, 3:49pm

Thanks @dadoonet,

So should we reduce our index size? These are daily indices, which seem to be what your slide deck recommends.

Does 1 primary + 1 replica seem right? Or should we increase primaries or replicas?

As I understand your slide deck, more primaries increase write performance and more replicas increase read performance. Is that right?

dadoonet · January 10, 2020, 12:03am

Correct.

I'd probably try 400 GB / 50 GB per shard, so 8 primaries.
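
The arithmetic behind that suggestion can be sketched as follows. This is only an illustration: the helper name `primaries_for` is hypothetical, and the 50 GB default is simply the per-shard target used in the reply above.

```python
import math


def primaries_for(index_size_gb: float, target_shard_gb: float = 50.0) -> int:
    """Smallest primary count that keeps each shard at or under the target size."""
    if index_size_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    # Round up so that no shard exceeds the target; always at least one primary.
    return max(1, math.ceil(index_size_gb / target_shard_gb))


print(primaries_for(400))      # 8
print(primaries_for(400, 20))  # 20
```

With 1 replica, 8 primaries would give 16 shard copies per daily index, which leaves room to spread the work across all 4 data nodes instead of only 2.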

Naseem · January 10, 2020, 1:12am

Would rolling over at 50GB per index be an alternative? 1 primary + 1 replica with each index being no more than 50GB?
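
A size-based rollover of this kind could look roughly like the sketch below, which only builds the request for the rollover API and does not send it. The alias name `logs-write` is a made-up placeholder, and the sketch assumes the alias already points at the current write index.

```python
import json
from typing import Tuple


def rollover_request(alias: str, max_size: str = "50gb") -> Tuple[str, str]:
    """Build the path and JSON body for a size-conditioned rollover request."""
    path = f"/{alias}/_rollover"
    # max_size is compared against the total size of the index's primary shards,
    # so with a single primary it acts as a per-shard limit.
    body = {"conditions": {"max_size": max_size}}
    return path, json.dumps(body)


# "logs-write" is a hypothetical alias used only for illustration.
path, body = rollover_request("logs-write")
print("POST", path)
print(body)
```

The conditions are only evaluated when the request is made, so something has to issue it periodically, for example an index lifecycle management policy with a rollover action.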

dadoonet · January 10, 2020, 6:04am

We normally advise keeping shard size between 20 and 50 GB, but it depends on your use case, so you need to test it.

Naseem · January 10, 2020, 6:14am

Alright, we will try 8 primaries with 1 replica each, thanks.
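
Because the primary shard count is fixed when an index is created, a change like this would normally go into the index template so that it applies to each new daily index. Below is a minimal sketch of such a template body, assuming the legacy `PUT _template/<name>` endpoint; the pattern `logs-*` and the function name are placeholders, not the poster's actual configuration.

```python
import json


def daily_index_template(pattern: str, primaries: int = 8, replicas: int = 1) -> dict:
    """Build a template body that sets shard counts for matching new indices."""
    return {
        "index_patterns": [pattern],
        "settings": {
            "index": {
                "number_of_shards": primaries,    # fixed at index creation
                "number_of_replicas": replicas,   # can be changed later
            }
        },
    }


# "logs-*" is a hypothetical index pattern used only for illustration.
template = daily_index_template("logs-*")
print(json.dumps(template, indent=2))
```

Existing indices keep their current primary count; only indices created after the template is in place pick up the new settings.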

