Elasticsearch Cluster topology

Hello Experts,

I am new to the Elasticsearch world and need advice on setting up the cluster topology. I am planning to have 5 nodes in the cluster. Can you suggest:

  1. How many nodes should act as master nodes?
  2. How many nodes should act as data nodes?
  3. In some places I have seen it advised to keep master and data nodes separate, to make better use of the cluster nodes. My initial thought is to have 3 nodes acting as both master and data nodes, and 2 nodes dedicated as data nodes.
  4. F5 setup: We are planning to use an F5 in front of the cluster. The app will connect to the ES cluster through an F5 VIP. Which nodes should the F5 point to? Should it round-robin between the data nodes only, or between all nodes (master and data)?
  5. As I understand it, ES splits an index into shards. For example, if I have one index with 10 shards spread over 2 data nodes, that is 5 shards per node. If one data node goes down, only 1 data node serves the traffic.
  • Will I only see partial data? If so:
  • I don't see any option to set replication at the cluster level. Please suggest if there is another way to do that.
  • At the index level we can configure 1 replica, but sometimes the app team may miss that. What's the best way to handle this?

Thanks
DK

Guys, can you please advise on the questions above? Let me know if you need any other information.

Thanks
DK

Read this and specifically the "Also be patient" part.

This is correct: you would have 3 master-eligible nodes and all 5 as data nodes. You could also make all 5 nodes master-eligible, but then set the minimum master nodes to 3 (a majority of 5).
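To make the "min master nodes" number concrete, here is a minimal sketch of the usual majority rule (half the master-eligible nodes, rounded down, plus one). The function name is made up for illustration. On Elasticsearch versions before 7.0 the result is what you would put in `discovery.zen.minimum_master_nodes`; newer versions manage the quorum themselves.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Return the smallest majority (quorum) of the master-eligible nodes.

    Hypothetical helper for illustration only; it is not part of any
    Elasticsearch client library.
    """
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


# The two layouts discussed in this thread:
print(minimum_master_nodes(3))  # 3 master-eligible nodes -> 2
print(minimum_master_nodes(5))  # all 5 nodes master-eligible -> 3
```

With 3 master-eligible nodes the cluster can lose one of them and still elect a master; with 5 it can lose two.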

Ans: It is the data nodes that serve requests, so depending on your endpoint (for example, port 9200 for REST-based calls) you would expose all 5 nodes.

Depends on which shard goes down.
If a replica shard goes down, you would still see the current data.
If the primary goes down after the copy to the replica shard, you are still fine.
If the primary goes down before the copy, most index operations are recorded in a transaction log that is written on both the primary and the replica, so you would still not lose your data; those operations are replayed on the replica when it becomes the new primary.
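For the 10-shards-on-2-nodes example from the question, a toy model shows why the replica count decides whether you see partial data. This is only a sketch: the round-robin placement and the function names are assumptions for illustration, not Elasticsearch's real allocator. The one rule it borrows is that copies of the same shard are never placed on the same node.

```python
def place_shards(num_shards, nodes, replicas):
    """Toy placement: copy c of shard s goes to node (s + c) % len(nodes).

    Assumed round-robin scheme, not Elasticsearch's actual allocation logic.
    """
    if replicas >= len(nodes):
        raise ValueError("each copy of a shard needs its own node")
    placement = {node: set() for node in nodes}
    for shard in range(num_shards):
        for copy in range(replicas + 1):  # copy 0 is the primary
            placement[nodes[(shard + copy) % len(nodes)]].add(shard)
    return placement


def available_shards(placement, failed_node):
    """Shards that still have at least one copy on a surviving node."""
    alive = set()
    for node, shards in placement.items():
        if node != failed_node:
            alive |= shards
    return alive


nodes = ["node-a", "node-b"]  # hypothetical node names
for replicas in (0, 1):
    placement = place_shards(10, nodes, replicas)
    survivors = available_shards(placement, failed_node="node-b")
    print(f"replicas={replicas}: {len(survivors)} of 10 shards still available")
```

In this model, 0 replicas leaves 5 of the 10 shards reachable after one node fails (partial data), while 1 replica keeps all 10 reachable.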

As of now the basic unit is an index, so you would need to enforce some rules on index creation. You can also add or remove replicas at any time to correct such mistakes.
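One common way to enforce such a rule is an index template that applies a default replica count to every new index. The sketch below only builds the JSON request bodies. The template name `default_replicas` and the helper names are made up, and the field names assume the legacy `_template` API of Elasticsearch 6.x (5.x uses `template` instead of `index_patterns`), so check them against your version.

```python
import json


def default_replica_template(replicas=1, pattern="*"):
    """Body for: PUT _template/default_replicas (hypothetical template name)."""
    return {
        "index_patterns": [pattern],  # assumed 6.x field name
        "order": 0,                   # low order so specific templates can override
        "settings": {"index": {"number_of_replicas": replicas}},
    }


def replica_update_body(replicas):
    """Body for: PUT <index>/_settings, to fix an index created without replicas."""
    return {"index": {"number_of_replicas": replicas}}


print(json.dumps(default_replica_template(), indent=2))
print(json.dumps(replica_update_body(1)))
```

The template covers indices created after it is installed; the settings update covers an existing index where the replica setting was missed.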
