How to force replica placement in cluster


#1

Dear Folks,

How can I force a node to receive replica shards, or prevent it from receiving them?

We already know this here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-cluster.html#allocation-awareness

But it doesn't really answer how to make only specific nodes "replica nodes" and exclude the others, given that some nodes are not always available.

Thanks a lot!
Steph


(Mark Walkom) #2

You cannot do this.


#3

Thanks for your answer! Too bad, so there is no way to make use of Amazon Spot Instances?

The setup to evaluate:
2 Core Nodes ("always" available, spread over two availability zones)
1 Spot Instance (Not always available)

Default index: 5 Shards, 1 Replica

That should really be possible (with some kind of filter similar to shard allocation filtering). Some instance types (spot instances) on AWS are shut down only rarely, so this would be a great way to reduce costs...

Someone has built something like this, but I think it's still not "safe" regarding the replica placement.

I think there is still a risk that one core node fails and the spot instance goes down at the same time = cluster broken...

Any suggestions?

Best
Steph


(Colin Goodheart-Smithe) #4

You shouldn't need to force particular nodes to only allocate replica shards here. There is no difference between the data stored on a primary and replica shard. The only difference is that Elasticsearch labels one of them as a primary. If the primary shard disappears, the master node will change one of the remaining replicas of that shard to a primary. So if your 'spot instance' contained a primary shard and was shut down, then a replica on one of the other nodes would be promoted to a primary.
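To make the promotion step concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not Elasticsearch's actual allocation code, and the node names and shard layout are hypothetical.

```python
# Toy model only: this is NOT Elasticsearch's allocation code.
# Node names and the shard layout below are hypothetical.

def remove_node(shards, lost_node):
    """Drop every copy held by lost_node and promote a surviving
    replica whenever the primary copy was lost."""
    result = {}
    for shard_id, copies in shards.items():
        # Copy the dicts so the original layout is left untouched.
        survivors = [dict(c) for c in copies if c["node"] != lost_node]
        if survivors and not any(c["primary"] for c in survivors):
            survivors[0]["primary"] = True  # promote a remaining replica
        result[shard_id] = survivors
    return result


# Shard 0 has its primary on the spot node, shard 1 has its replica there.
layout = {
    0: [{"node": "spot-1", "primary": True}, {"node": "core-1", "primary": False}],
    1: [{"node": "core-2", "primary": True}, {"node": "spot-1", "primary": False}],
}

after = remove_node(layout, "spot-1")
for shard_id, copies in after.items():
    print(shard_id, copies)
```

In this model, losing the spot node leaves each shard with exactly one copy on a core node, and that copy is a primary. The shard is still fully usable, it just has no replica until one is allocated again.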

As stated in the blog post you linked to, you will want to set allocation attributes so that you can guarantee that at least one copy of every shard is located on a 'core node'.
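A minimal sketch of what such settings might look like, written as a Python snippet that only builds and prints the request body. The attribute name `group` and its values `core` and `spot` are assumptions for illustration. Each node would also have to be started with that attribute, and the node-level setting key differs between Elasticsearch versions, so check the allocation awareness page linked earlier in the thread before using this.

```python
import json

# Assumption: every node carries a custom attribute named "group" whose
# value is "core" or "spot". Both the name and the values are hypothetical.
AWARENESS_ATTRIBUTE = "group"
GROUPS = ["core", "spot"]


def awareness_settings(attribute, forced_values=None):
    """Build a cluster settings body that enables allocation awareness.

    If forced_values is given, forced awareness is added as well, so
    copies are not piled onto one group while the other group is missing.
    """
    settings = {
        "cluster.routing.allocation.awareness.attributes": attribute,
    }
    if forced_values:
        key = f"cluster.routing.allocation.awareness.force.{attribute}.values"
        settings[key] = ",".join(forced_values)
    return {"persistent": settings}


body = awareness_settings(AWARENESS_ATTRIBUTE, forced_values=GROUPS)
print(json.dumps(body, indent=2))
```

The printed body is meant to be sent to the cluster settings endpoint (`PUT _cluster/settings`). With one replica and two groups, awareness should place one copy of each shard in each group, which is what keeps a copy on a core node. Whether you also want the forced variant depends on whether replicas should stay unassigned while the spot group is gone.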


#5

Thanks for your reply!

The problem is that when I use exactly the same settings as in the blog post (one core group and one spot group), the spot group only receives replica shards and its nodes are totally idle (no CPU or IO load). I read that replica shards should be used for reads, but that doesn't seem to work with the manual shard allocation setting. The core nodes are running at 90% CPU while the spot instances are sleeping... I'm not sure if this is the intended behaviour.

Thanks and best
Steph


#6

Maybe I should file a bug report on GitHub?


(Mark Walkom) #7

It's not a bug.

Are you querying via these spot nodes?


#8

I have two master nodes with data (false) and master (true); all other nodes in the cluster have data (true) and master (false). The application only accesses the two master nodes (no direct connection to any data node). But the masters route the queries to only one of the core nodes, which sits at high load while the other nodes stay idle.


(Mark Walkom) #9

Two things.

  1. Having an odd number of masters is much better; you run less risk of a split brain.
  2. If you are using dedicated masters, DO NOT query or index through them. The whole point of splitting them out is to ensure optimal operation of your cluster. Pushing queries and indexing through them runs the risk of an OOM, which negates the whole purpose of splitting them out.
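
The arithmetic behind the first point can be sketched in a few lines of Python. This assumes the usual majority rule, where a master can only be elected when more than half of the master-eligible nodes agree. In Elasticsearch versions of this era that majority is, as far as I know, what `discovery.zen.minimum_master_nodes` is supposed to be set to, so check the docs for your version.

```python
def quorum(master_eligible_nodes):
    """Majority of the master-eligible nodes: floor(n / 2) + 1."""
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes):
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


for n in (2, 3):
    print(f"{n} masters: quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
```

With two masters the quorum is 2, so losing either one leaves no majority. With three masters the quorum is still 2, so one can fail. That is why the third master buys resilience and the second one does not.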

#10

That is a good suggestion, thank you! I will change this... do you think that will fix the problem of a single node taking all the load?

Update: Already tested it... now the load is distributed more evenly across the data nodes. You helped us a lot, thanks again! I will check out the _local preference as well...


(Mark Walkom) #11

Nope, you need to send queries to these replica nodes. You can also use the _local preference so that queries try the local (i.e. replica) copy first.
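
As a rough illustration of both suggestions, the Python sketch below rotates search requests across the data nodes and adds `preference=_local` to each one. The host names and the index name are hypothetical, and the snippet only builds URLs; it does not send anything.

```python
from itertools import cycle
from urllib.parse import urlencode

# Hypothetical data node addresses; replace with your own.
DATA_NODES = [
    "http://data-core-1:9200",
    "http://data-core-2:9200",
    "http://data-spot-1:9200",
]


def search_urls(nodes, index, preference="_local"):
    """Yield search URLs forever, rotating through the nodes round-robin."""
    query = urlencode({"preference": preference})
    for node in cycle(nodes):
        yield f"{node}/{index}/_search?{query}"


urls = search_urls(DATA_NODES, "my-index")
first_four = [next(urls) for _ in range(4)]
for url in first_four:
    print(url)
```

The fourth URL wraps around to the first node again. Most client libraries can do this rotation for you if you give them the list of data nodes instead of the master addresses.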

