I am new to the Elasticsearch world and need advice on setting up the cluster topology. I am planning to have 5 nodes in the cluster. Can you suggest:
how many nodes should act as master?
how many nodes should act as data nodes?
In some places I have seen it advised to keep master and data nodes separate, to better utilize the cluster nodes. My initial thought is to have 3 nodes acting as both master and data, and 2 nodes dedicated as data nodes.
F5 setup: We are planning to use an F5 in front of the cluster for the app. The app will have an F5 VIP to connect to the ES cluster. Which nodes should the F5 point to: should it round-robin between data nodes only, or between all nodes, master and data?
In my understanding, ES splits an index into shards. For example, if I have one index with 10 shards spread across 2 data nodes, that is 5 shards per node. If one data node goes down, only 1 data node serves the traffic.
Will I only see partial data? If this is true,
I don't see any option to set replication at the cluster level. Please suggest if there is any other way to do that.
At the index level we can configure replicas as 1, but sometimes the app team can miss that. What's the best way to handle this?
This is correct: you would have 3 master-eligible nodes and all 5 as data nodes. You could also make all 5 nodes master-eligible, but then set the minimum master nodes to 3.
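The value of 3 for five master-eligible nodes follows the usual majority rule for avoiding split brain. As a rough sketch (the function name is hypothetical, and this assumes an Elasticsearch version that still uses the `discovery.zen.minimum_master_nodes` setting), the number can be computed like this:

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Return the smallest majority of the master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    # Majority quorum: more than half of the master-eligible nodes.
    return master_eligible // 2 + 1


# The two layouts discussed above: 3 or 5 master-eligible nodes.
for n in (3, 5):
    print(n, "master-eligible nodes -> minimum master nodes:", minimum_master_nodes(n))
```

With 3 master-eligible nodes this gives 2, and with all 5 master-eligible it gives 3, which matches the suggestion above.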
It's the data nodes that serve requests, and depending on your endpoint (namely port 9200 for REST-based calls) you would expose all 5 nodes.
It depends on which shard goes down.
If a replica shard goes down, you would still see the current data.
If the primary goes down after its changes have been copied to the replica shard, you are still fine.
If the primary goes down before the copy completes, most index operations are covered by a transaction log that is written on both the primary and the replica, so you would still not lose your data; those operations are replayed on the replica when it becomes the new primary.
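To make the partial-data question concrete, here is a toy model of the 10-shard, 2-node example from the question. It is only a sketch: the round-robin placement and the node names are assumptions made for illustration, not how Elasticsearch actually allocates shards.

```python
def place_shards(num_shards, nodes, replicas):
    """Toy placement: copy c of shard s goes on node (s + c) % len(nodes)."""
    if replicas >= len(nodes):
        raise ValueError("each copy of a shard needs its own node")
    placement = {node: set() for node in nodes}
    for shard in range(num_shards):
        for copy in range(replicas + 1):  # copy 0 is the primary
            placement[nodes[(shard + copy) % len(nodes)]].add(shard)
    return placement


def available_shards(placement, failed_node):
    """Shards that still have at least one copy on a surviving node."""
    alive = set()
    for node, shards in placement.items():
        if node != failed_node:
            alive |= shards
    return alive


nodes = ["node-1", "node-2"]  # hypothetical node names
for replicas in (0, 1):
    left = available_shards(place_shards(10, nodes, replicas), "node-2")
    print(f"replicas={replicas}: {len(left)} of 10 shards still reachable")
```

In this model, losing one node with no replicas leaves 5 of the 10 shards reachable, while one replica per shard keeps all 10 reachable.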
As of now the basic unit is an index, so you would need to enforce some rules on index creation. You can also add or remove replicas at any time to correct such mistakes.
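Correcting a missed replica setting after the fact goes through the index settings endpoint. The sketch below only builds the request rather than sending it; the helper name and the index name `app-logs` are hypothetical, and you would send the result with whatever HTTP client you already use against port 9200.

```python
import json


def replica_update_request(index: str, replicas: int):
    """Build (method, path, body) for changing an index's replica count."""
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    # number_of_replicas is a dynamic index setting, so it can be
    # changed on an existing index without recreating it.
    body = {"index": {"number_of_replicas": replicas}}
    return "PUT", f"/{index}/_settings", json.dumps(body)


method, path, body = replica_update_request("app-logs", 1)
print(method, path)
print(body)
```

A periodic check that applies this to any index found with zero replicas is one way to enforce the rule if the app team forgets it at creation time.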