Two-node cluster with all primary shards on Node 1 and all replica shards on Node 2

Dear all, I have an Elasticsearch cluster composed of two nodes: a master node and a data node.

When I joined the second node to the cluster, after some hours the cluster health became green, but I can see that all primary shards are on node 1 and all replica shards are on node 2.

I thought that half of the primary shards would stay on node 1 and the other half would migrate to node 2, and the same with the replica shards, but it didn't turn out the way I supposed.

Node 1:

Node 2:

Please, can you tell me if this shard balancing is OK? Or what can I do to migrate half of the primary shards to Node 2 and half of the replica shards to Node 1?
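To see the current layout at a glance, the _cat API lists every shard with its type and the node it sits on. A minimal check, assuming the REST API is reachable on port 9200 (the hostname below is a placeholder):

# p = primary, r = replica; the "node" column shows where each copy lives
curl -s 'http://node1.mycompany.com:9200/_cat/shards?v'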

Special thanks !!!

Hi Robert,

Did you restart node 2 or both nodes after collection creation?

Regards

Dominique

Hi Dominique, yes I have restarted both nodes.

But an hour ago I stopped the Elasticsearch service on Node 1, and after that I can see primary shards on both nodes, so now I can see the shard balancing.

In the Stack Monitoring section of Kibana, while the Elasticsearch service was down on Node 1, I was still seeing Node 1 with status OK... Why?
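One way to cross-check, independently of the monitoring UI, is to ask the surviving node which nodes it can actually see. A minimal sketch, assuming node 2's REST API is reachable on port 9200 (the hostname is a placeholder):

# lists the nodes currently in the cluster; "*" under master marks the elected master
curl -s 'http://node2.mycompany.com:9200/_cat/nodes?v&h=name,ip,node.role,master'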

Thanks a lot again!!!

Robert,

Maybe the restart was too short for the monitoring to report the offline status.

I just killed one node of a 3-node test cluster. It is a Docker container with auto-restart. For a few seconds I could see the status switch to "offline".

I suppose that if you go into the monitoring details for your node, you will see the restart.

Dominique

Dear Dominique, if I stop the Elasticsearch service on node 2 (the data node), it appears OFFLINE in the Stack Monitoring section.

But if I stop the Elasticsearch service on node 1 (the master node), it keeps appearing ONLINE.

Any idea please?

Regards!!!

What is your worry? If you started with one node, it will of course have all the primaries. Then you added a second, empty node, and the first priority is likely to set up the replicas, which it will allocate on node 2; when that is done, it will think about rebalancing. (Actually these run concurrently, but I think replica building is a higher priority than rebalancing, as all indices will be yellow until it's done.)
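If you want to watch that sequence, the cluster health API shows it: while the replicas are still being built the status is yellow and initializing_shards / unassigned_shards are non-zero; once it finishes they drop to 0 and the status turns green. A minimal check (the hostname is a placeholder):

curl -s 'http://node1.mycompany.com:9200/_cluster/health?pretty'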

Then the question is: do we care where the shards are? I don't think so. Any new data has to go to both the primary and the replica, so the indexing load is spread across both nodes, and queries can be serviced from either the primary or the replica, so that load is also spread.

And if a node fails, like node 1, all of the replicas on the surviving node will be promoted to primaries and things will be fine; when node 1 comes back, its shards will be rebuilt as replicas.

Note that having only two nodes is not very reliable: if both are master-eligible and you lose one, the cluster will have problems, and if they are not both master-eligible and you lose the master, the cluster will stop. See the docs on small clusters.

Ok Steve, thanks a lot, I'll read the doc you mentioned.

I think I need two data nodes instead of one master node and one data node.

Please, one last question:

On each Beats client on the remote servers I've configured:

output.elasticsearch:
  hosts: ["node1.mycompany.com:9200", "node2.mycompany.com:9200"]

I suppose the client tries to connect to ES node 1 and, if that's down, tries to connect to ES node 2. But if I stop the Elasticsearch service on node 1, the Beats data from the clients don't reach node 2. Is this OK, or do I have to do any extra configuration?
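A simple first check, assuming the Beats client can reach both nodes on port 9200 (add credentials/TLS options if security is enabled):

# run these from the Beats client; each node that is up should answer with its cluster info
curl -s 'http://node1.mycompany.com:9200'
curl -s 'http://node2.mycompany.com:9200'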

Thanks in advance!!!

Regards.

You need two data nodes for sure to have any replicas, and it's best to also make them both master-eligible, but also add a small third voting-only master-eligible node that has no data and just helps keep the masters organized; then it's all good.
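A minimal sketch of that third node, assuming Elasticsearch 7.9+ and the default package layout (the config path and hostnames are assumptions, adjust for your setup):

# on the small third host: dedicated voting-only master-eligible node (no data, no ingest)
cat >> /etc/elasticsearch/elasticsearch.yml <<'EOF'
node.roles: [ master, voting_only ]
EOF

# after restarting it, verify from any node: "v" in node.role means voting_only,
# "*" under master marks the elected master
curl -s 'http://node1.mycompany.com:9200/_cat/nodes?v&h=name,node.role,master'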

For your outputs: I'm not sure how Filebeat handles two nodes, but the docs say: "If one node becomes unreachable, the event is automatically sent to another node." It should work, then, though you might test each separately, i.e. only list node2 in the output and see if that works.

In any case, make sure those servers are not public on the Internet, so evil hackers can't reach them (this can be a problem when the clients are remote).
