Restart Elasticsearch cluster

Hello,
I have 6 Elasticsearch instances running in Docker containers on 3 physical nodes. When one of the Elasticsearch instances restarts, all indices become inaccessible. How can I fix it? The Elasticsearch config is as follows:

cluster.name: my-cluster

node.attr.rack: r1
cluster.routing.allocation.awareness.attributes: rack_id

bootstrap.memory_lock: true

network.host: 192.168.1.2

discovery.zen.ping.unicast.hosts: 
    - 192.168.1.2
    - 192.168.1.3
    - 192.168.1.4
    - 192.168.1.5
    - 192.168.1.6
    - 192.168.1.7

discovery.zen.minimum_master_nodes: 4


cluster.initial_master_nodes:
    - node01
    - node02
    - node03
    - node04
    - node05
    - node06

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.keystore.type: PKCS12
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: cert.p12
xpack.security.transport.ssl.truststore.path: cert.p12
xpack.security.transport.ssl.truststore.type: PKCS12

# Auth method
xpack:
  security:
     authc:
       realms:
          native:
             native1:
                order: 0

What do the logs on the master show when you restart the node?

Elasticsearch logging is not enabled; to enable it, I need to restart the nodes. Restarting each node makes the indices unavailable for 30 minutes, which is critical for the business.

Are your indices configured to have a replica?

Yes, all indices have a replica, but when one of the nodes is restarted, all indices go into UNASSIGNED status.

Could it be due to the rack settings?

Yes, that could be the case. How is rack_id set on the different nodes? How many replicas do you have configured? It also looks like you are setting the node attribute rack, but the awareness setting refers to rack_id. Is this a copy-paste problem or actually in the config?

Yes, this is the actual config. I also noticed that this attribute differs from the one described here:

Each index has one replica.

The settings on the other nodes are as follows:
node01:
node.attr.rack: r1
cluster.routing.allocation.awareness.attributes: rack_id

node02:
node.attr.rack: r1
cluster.routing.allocation.awareness.attributes: rack_id

node03:
node.attr.rack: r2
cluster.routing.allocation.awareness.attributes: rack_id

node04:
node.attr.rack: r2
cluster.routing.allocation.awareness.attributes: rack_id

node05:
node.attr.rack: r3
cluster.routing.allocation.awareness.attributes: rack_id

node06:
node.attr.rack: r3
cluster.routing.allocation.awareness.attributes: rack_id

Nodes node01 and node02 are on the first physical server, node03 and node04 are on the second, and node05 and node06 are on the third.

You need to correct that so it is consistent.
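
The mismatch can be checked mechanically. The following is a minimal Python sketch, not an official tool: it takes each node's custom attributes (typed in by hand here from the configs posted above) and reports which nodes lack an attribute named in cluster.routing.allocation.awareness.attributes. The function name nodes_missing_awareness_attribute is hypothetical.

```python
def nodes_missing_awareness_attribute(node_attrs, awareness_attributes):
    """Return {node_name: [missing attribute names]} for every node that
    lacks an attribute listed in the (comma-separated) awareness setting."""
    required = [a.strip() for a in awareness_attributes.split(",") if a.strip()]
    missing = {}
    for node, attrs in node_attrs.items():
        absent = [a for a in required if a not in attrs]
        if absent:
            missing[node] = absent
    return missing


# Attributes as posted in this thread: every node sets "rack", none sets "rack_id".
posted = {
    "node01": {"rack": "r1"},
    "node02": {"rack": "r1"},
    "node03": {"rack": "r2"},
    "node04": {"rack": "r2"},
    "node05": {"rack": "r3"},
    "node06": {"rack": "r3"},
}

# Check against the value from the posted elasticsearch.yml.
print(nodes_missing_awareness_attribute(posted, "rack_id"))
```

With the posted values, all six nodes are reported as missing rack_id, while checking against rack reports none.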

After I changed the parameter node.attr.rack to node.attr.rack_id, the replicas stayed UNASSIGNED until I changed the parameter back to node.attr.rack. The log contained this error:
node does not contain the awareness attribute [rack]; required attributes cluster setting [cluster.routing.allocation.awareness.attributes=rack
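
The error names rack as the awareness attribute, although the posted elasticsearch.yml says rack_id, so it may be worth checking which value the cluster is actually using. A value applied through the cluster settings API takes precedence over elasticsearch.yml. As a hedged sketch, the Python helper below (the name effective_setting is hypothetical) picks the winning value out of a parsed response from GET _cluster/settings?include_defaults=true&flat_settings=true. It assumes that values from elasticsearch.yml are reported under defaults, and the sample response is invented for illustration, not taken from this cluster.

```python
def effective_setting(settings_response, key):
    """Return (source, value) for `key`, following the precedence
    transient, then persistent, then everything reported as defaults."""
    for source in ("transient", "persistent", "defaults"):
        section = settings_response.get(source, {})
        if key in section:
            return source, section[key]
    return None, None


KEY = "cluster.routing.allocation.awareness.attributes"

# Invented example response in flat_settings form, for illustration only.
sample = {
    "persistent": {KEY: "rack"},
    "transient": {},
    "defaults": {KEY: "rack_id"},
}

print(effective_setting(sample, KEY))
```

If a persistent or transient value turns out to be set, it overrides whatever the file says until it is changed or removed through the same API.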
