Elasticsearch 7-node cluster issues

Hey guys,
I'm making some changes to my Elasticsearch cluster and need a little help with node role allocation and instance configuration.
Until now we had 3 nodes acting as both data and master-eligible nodes, plus Kibana on a separate node talking to the cluster.
That setup gave us a really hard time, so we decided to take Elasticsearch one step further.
Our data:
The cluster is managed by Curator cron jobs that keep 2 weeks of raw data (time-based indices) open for debug querying and one more week closed for emergencies (rough sketch of the action file below).
Besides that, we have another index that is kept up to date from the raw-data stream.
The cluster holds ~100GB of data in 15 different indices.
Each index is divided into 5 shards with 1 replica.
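A minimal Curator action-file sketch along those lines, assuming a prefix like rawdata- and name-based age filtering (the prefix, timestring and options are placeholders, not our actual config):

actions:
  1:
    # close raw-data indices once they are older than 14 days
    action: close
    description: "Close raw-data indices older than 14 days"
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: rawdata-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 14

A second delete_indices action with unit_count: 21 drops the closed indices after the extra emergency week.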

The current setup is composed of 7 nodes (role settings sketched below):
3 dedicated master nodes - 2 CPU, 4GB RAM.
3 dedicated data nodes - 4 CPU, 16GB RAM.
1 coordinating node with Kibana - 4 CPU, 8GB RAM.
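For reference, the role split is done per node type in elasticsearch.yml, roughly like this (5.x/6.x-style settings; 7.9+ replaces these booleans with a single node.roles list):

# dedicated master-eligible node (the 2 CPU / 4GB instances)
node.master: true
node.data: false
node.ingest: false

# dedicated data node (the 4 CPU / 16GB instances)
node.master: false
node.data: true
node.ingest: true

# coordinating-only node (the one co-located with Kibana)
node.master: false
node.data: false
node.ingest: false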

These are my questions:
1. Does the instance setup make sense? Is it reasonable? I read that master nodes can usually be quite "light" compared to data nodes - does a 2GB RAM instance with a 1GB heap sound good?

2. When I get the /_cat/nodes stats I notice that ram.percent is pretty high (above 90 on all nodes):

ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.31.17.103 23 99 11 5.88 5.22 5.05 d - DATA1
172.31.12.76 31 94 37 0.30 0.20 0.12 m * MASTER1
172.31.29.43 22 99 7 5.26 5.00 4.98 d - DATA3
172.31.14.236 12 89 0 0.00 0.00 0.00 m - MASTER2
172.31.14.54 21 97 1 0.12 0.04 0.01 - - KIBANA
172.31.14.55 23 93 0 0.00 0.00 0.00 m - MASTER3
172.31.17.46 20 99 9 5.88 5.12 4.97 d - DATA2

Is this a proper state, or should I limit it with some config setting? How can that be done?
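For reference, the table above is the default output of the cat nodes API; with the columns listed explicitly it would be:

GET _cat/nodes?v&h=ip,heap.percent,ram.percent,cpu,load_1m,load_5m,load_15m,node.role,master,name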

3. Our system currently handles ~150 docs per second, but we want to be able to scale up to 1,000 (with the data volume increasing to ~500GB) with minimal adjustments in the future.
Any recommendations that would get us closer to that spot?

4. Should we add a dedicated coordinating node besides the node with Kibana? What would the immediate effects of such a change be? For now it's as if we run the cluster without any coordinating node, because the node with Kibana runs Elasticsearch on localhost and communicates with the cluster through the transport client.

  1. How should we talk to the cluster from the web client? Provide a list of all the nodes' IPs? Only the masters?

  2. Does a cluster of 2 data nodes with 32GB RAM each sound better than the current setup?

  3. Is a separate monitoring cluster a mandatory configuration in a production environment? Besides keeping monitoring available independently, does it take some load off the main cluster?

Thanks so much for your help:)

It sounds to me like you may have far too many small shards. 15 indices, each with 10 shards (5 primaries plus 5 replicas) per day, gives 2,100 shards over a 14-day period. Given the data volumes you have mentioned this seems excessive, as each shard has overhead and consumes resources.

The easiest way to reduce the number of shards would be to reduce the number of primary shards per index from 5 to 1. Aim for an average shard size between a few GB and a few tens of GB. If you are on a recent release of Elasticsearch you can use the shrink index API on existing indices, but otherwise you may need to implement the change for new indices and wait for the old data to be phased out in order to see the effect.
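A rough outline of the shrink approach, with placeholder index names and one of your data nodes picked as the temporary target:

# 1) move all shards of the source index onto a single node and block writes
PUT rawdata-2018.01.01/_settings
{
  "index.routing.allocation.require._name": "DATA1",
  "index.blocks.write": true
}

# 2) once relocation has finished, shrink into a new single-shard index
POST rawdata-2018.01.01/_shrink/rawdata-2018.01.01-shrunk
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1
  }
}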

Hey Christian_Dahlqvist, thanks for your response.
The shard allocation was chosen with a view to future demand, to support a total data volume of 0.5TB - 1TB.
At the time, 5 shards and 1 replica seemed like the right choice for this kind of setup.
I'm going to change the future indices to be created with 1 shard and monitor the results.
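Something like this for the index template (the template name and pattern are placeholders; on 5.x the index_patterns key is called template instead):

PUT _template/rawdata
{
  "index_patterns": ["rawdata-*"],
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1
  }
}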

Thank you.

The good thing about time-based indices is that you can easily change the number of shards for the next day and therefore adapt as volumes grow. Is that data volume for the entire cluster or the amount of data indexed per day?

Yeah, I'm going to change the template so that tomorrow's index will be created with 1 shard, and I'll share the results with you.
