Where should "cluster.routing.allocation.same_shard.host" be set?

Hi Folks.

Can anyone confirm that when I have dedicated masters, "cluster.routing.allocation.same_shard.host" is only relevant on the master nodes?

Also, can I assume all cluster.routing.* settings only apply to the node(s) serving as the master?

Allocation applies to any node that holds data.

That makes sense, but which node roles (master vs. data) is this setting actually relevant to?

Assuming you had:
- 10 data nodes (not eligible for the master role)
- 3 master nodes (not eligible for the data role)

Would same_shard.host be required on the data nodes, the master nodes, or both?

I've yet to find the time to dig into the ES source to understand shard allocations in detail.

  1. Do the masters read the state of the cluster and make all shard allocation decisions internally?
    - In this scenario, same_shard.host seems like it would only be relevant on the master, where the decisions are being made.
    or
  2. Do the masters ask the individual data nodes to make the allocation decisions?
    - In this scenario, same_shard.host could be relevant on the data nodes or the master nodes (depending on implementation).

My assumption would be #1 - but I didn't build ES so I need to ask.

A quick Google search for this setting suggests that the ES file below is where it is consumed. Perhaps a more direct question would be: does the master or the data role use this class/code?

/src/main/java/org/elasticsearch/cluster/routing/allocation/decider/SameShardAllocationDecider.java

  1. The elected master does, yes (the others are there for redundancy and don't play a part in allocation).
  2. No, the master node makes those decisions.

cluster.routing.allocation.same_shard.host really only applies if you have multiple ES instances (nodes) running on the same underlying physical/virtual host. See here.
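To illustrate why the setting only matters when several nodes share a machine, here is a minimal Python sketch of the kind of check the elected master would apply. It is a simplified model under stated assumptions, not the actual SameShardAllocationDecider code: the `Node` class, the `can_allocate` function, and the sample addresses are all hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str  # hypothetical node name
    host: str  # address of the machine the node runs on


def can_allocate(shard_id, target, assignments, same_host=False):
    """Return True if a copy of shard_id may be placed on target.

    assignments: list of (shard_id, Node) pairs for copies already placed.
    same_host stands in for cluster.routing.allocation.same_shard.host.
    """
    for placed_shard, node in assignments:
        if placed_shard != shard_id:
            continue
        # Two copies of one shard never share a node, regardless of the setting.
        if node.name == target.name:
            return False
        # With the setting on, two copies may not share a host either.
        if same_host and node.host == target.host:
            return False
    return True


a1 = Node("a1", "10.0.0.1")
a2 = Node("a2", "10.0.0.1")  # second instance on the same machine
b1 = Node("b1", "10.0.0.2")
placed = [(0, a1)]  # one copy of shard 0 already lives on a1

print(can_allocate(0, a2, placed))                  # True
print(can_allocate(0, a2, placed, same_host=True))  # False
print(can_allocate(0, b1, placed, same_host=True))  # True
```

With one node per host, the host check never rejects anything that the node check would not already reject, which is why the setting has no effect in that layout.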

Thanks, Mark.

As I expected, that setting is relevant to the master nodes, and that explains my issue.
I'm transitioning my cluster to new hardware where I'm running multiple ES instances per machine. The nodes themselves have same_shard.host set, but the existing cluster master(s) do not. We noticed that shard allocation was not respecting this flag, which makes sense if it's a setting relevant to the master.

For background, I'm on 1.7.1 and, for a few reasons not worth going into, my masters aren't in a position to be restarted (with the same_shard.host setting) just yet. For now I'll use the shard allocation awareness settings to try to pull this off. I have significantly more physical machines than shards per index, so this shouldn't be an issue.

Thanks again.

I have the same question, and I haven't found a definitive answer yet.

So if I run multiple data nodes on a single server, which elasticsearch.yml should I put cluster.routing.allocation.same_shard.host in?

  1. Only data nodes?
  2. Only master-eligible nodes (dedicated)?
  3. Both data nodes and master-eligible nodes?

Even though I'm using Shard Allocation Awareness now, knowing the answer will be helpful in the future.

All of them.
