Optimal configuration of the ES cluster

IliaIsakhin · September 8, 2020, 6:58am

Hello!

My ES 7.8.0 cluster has 5 nodes with 120 GB of heap memory and 32 CPUs. I also have a fairly large index with 6.6 million docs; according to the _cat/indices API, its store.size is 59.8gb and its pri.store.size is 30.1gb. It has 3 primary shards and 1 replica. Is that an optimal number of shards? I read that every shard should contain ~30 GB of data.
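For reference, here is a quick back-of-the-envelope check of the per-shard size implied by the figures above. It is only a sketch: the variable names are made up for illustration, and it assumes the data is spread evenly across shards, which may not hold in practice.

```python
# Figures quoted from the _cat/indices output above.
primary_store_gb = 30.1   # pri.store.size
total_store_gb = 59.8     # store.size (primaries + replicas)
primary_shards = 3
replicas = 1

# Each primary shard has `replicas` copies, so total shard count is:
total_shards = primary_shards * (1 + replicas)

# Average sizes, assuming an even distribution of data (an assumption).
avg_primary_shard_gb = primary_store_gb / primary_shards
avg_shard_gb = total_store_gb / total_shards

print(f"total shards: {total_shards}")
print(f"avg primary shard size: {avg_primary_shard_gb:.1f} GB")
print(f"avg shard size (all copies): {avg_shard_gb:.1f} GB")
```

So each shard currently holds roughly 10 GB, which is well under the ~30 GB figure I mentioned.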

At the same time, every node in my cluster is a master, data and ingest node. Is that OK, or must I configure it like 3 master nodes and 2 data nodes? Currently, I'm happy with the search speed and availability of the cluster. In addition, I periodically see a warning like this in the log file:

took [17.6s], which is over [10s], to compute cluster state update for [put-mapping

I find this strange given the fairly generous resources I gave my cluster.

Thank you!

Christian_Dahlqvist · September 8, 2020, 7:14am

It looks like you either have very large or constantly expanding mappings, which is causing updates to be slow. How many fields does your index have? Have you overridden any of the default settings?

Are you using parent-child or nested mappings?

IliaIsakhin · September 8, 2020, 8:07am

I have 2 dynamic mappings with path match and no nested or parent-child mappings.
My settings endpoint returns:

"search": {
      "max_buckets": "100000",
      "default_keep_alive": "1m",
      "max_keep_alive": "5m"
 }

Also, my _mappings endpoint returns quite a large JSON document because of the dynamic mappings.

Christian_Dahlqvist · September 8, 2020, 8:09am

How many fields do you have in your mappings? Have you overridden the default limit?
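One rough way to get that number is to walk the JSON returned by the mapping endpoint and count the entries. The sketch below is only an approximation: `count_mapped_fields` and the sample mapping are hypothetical, and it assumes that object fields and multi-fields each count as one field, which may not match exactly how Elasticsearch applies the limit.

```python
def count_mapped_fields(mapping: dict) -> int:
    """Recursively count entries under "properties", including
    object fields themselves and any multi-fields under "fields"."""
    total = 0
    for field_def in mapping.get("properties", {}).values():
        total += 1                                   # the field or object itself
        total += count_mapped_fields(field_def)      # sub-fields of an object
        total += len(field_def.get("fields", {}))    # multi-fields, e.g. .keyword
    return total


# Hypothetical mapping body, shaped like the "mappings" section of an index.
sample_mapping = {
    "properties": {
        "title": {
            "type": "text",
            "fields": {"keyword": {"type": "keyword"}},
        },
        "user": {
            "properties": {
                "name": {"type": "keyword"},
                "age": {"type": "integer"},
            }
        },
    }
}

print(count_mapped_fields(sample_mapping))
```

For a real index, you would pass in the "mappings" object from the mapping response instead of the sample.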

IliaIsakhin · September 8, 2020, 8:22am

"index.mapping.total_fields.limit": 200000

Yes, we have this index setting

Christian_Dahlqvist · September 8, 2020, 8:37am

That does not sound like a good setting value and will most likely cause problems. No wonder mapping changes result in slow cluster state updates as these are performed in a single thread.

I can think of no easy fix, so suspect you may either need to live with this or reconsider how you handle mappings to avoid this.

IliaIsakhin · September 8, 2020, 8:46am

Alright, got it, thank you for the answer!
Could you also please give advice about nodes roles and shards sizes for data like in my situation?

Christian_Dahlqvist · September 8, 2020, 9:36am

Your problem seems to be with the mappings and not necessarily with shard size or distribution. Have never seen a use case with even close to that level of mapped fields so have no advice to give. This is uncharted territory as far as I know.

system · October 6, 2020, 9:36am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
