We have a 5 nodes production setup (1 coordinator, 1 master, 3 data nodes). We have setup a Kafka sink to dump messages to an ES index in this cluster. The URL for the sink is pointed to the Coordinator node. Below is machine configuration of this cluster,
Elasticsearch assumes all data nodes to be equal, so the fact that you have one large and 2 small will mean that the the smaller ones will work much harder than the larger one and limit performance.
The main point of having a cluster is usually to avoid single points of failure in order to increase stability, availability and resiliency. A production setup with a single dedicated master node is therefore not recommended. Just because you can have specialised node types does not mean that you should. I would recommend that you instead setup a basic 3 node cluster where all nodes are on the same type of hardware and are all master eligible and hold data. Make sure that you set minimum_master_nodes to 2 to avoid split brain scenarios.
How have you arrived at this value? That is a lot of shards to index into, which may lead to bad indexing performance and possibly also bulk rejections.
Thanks @Christian_Dahlqvist for the reply. We shall definitely set up 3 basic nodes as you suggested and see if that solves this problem.
About the shards, we have large number of data streaming into the system each minute. The data could be classified by type which we know before creating an index (There are 260 possible types of data). So we decided to create index with these many shards & provide routing so data of each type will be stored in single shard.
If this approach is to cause problem, we shall setup time-based indices like you had suggested.
It does not sound to me like that is an appropriate use of routing, especially if it leads to that many shards concurrently being written to. Routing does not mean that each routing key gets it's own shard, just that all data related to a specific routing key resides in a single shard within the index.
We have setup 3 basic nodes setup & used default 5 primary & 1 replica shards as you had suggested. This solved the problem & inserts are faster now. Thanks @Christian_Dahlqvist for helping out.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.