Cluster fails to elect master after using new data store

We have a 2-node cluster running on an OpenShift platform. Recently, we provisioned new NFS volume shares to be used as data stores for the cluster, and we want to create a new cluster using those volumes. After deleting the old cluster's StatefulSet, we deployed a new one which mounts the new volume shares at /usr/elasticsearch/data. After both nodes start up, the logs show that they are able to discover each other through the k8s service used for discovery. However, they are unable to elect either one as master.

Below are the logs of the "elasticsearch-master-0" pod:

{"@timestamp":"2023-06-07T12:49:54.160Z", "log.level": "INFO", "message":"node name [elasticsearch-master-0], node ID [phv46TIaT0-ymV_7nVkriQ], cluster name [elasticsearch], roles [ingest, ml, master, data, remote_cluster_client]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.node.Node","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:24.256Z", "log.level": "INFO", "message":"[controller/523] [Main.cc@123] controller (64 bit): Version 8.2.3 (Build 537f37a54d22f1) Copyright (c) 2022 Elasticsearch BV", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"ml-cpp-log-tail-thread","log.logger":"org.elasticsearch.xpack.ml.process.logging.CppLogMessageHandler","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:25.156Z", "log.level": "INFO", "message":"Security is enabled", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.xpack.security.Security","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:28.759Z", "log.level": "INFO", "message":"license mode is [trial], currently licensed security realms are [reserved/reserved,file/default_file,native/default_native]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.xpack.security.authc.Realms","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:28.851Z", "log.level": "INFO", "message":"parsed [0] roles from file [/usr/share/elasticsearch/config/roles.yml]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.xpack.security.authz.store.FileRolesStore","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:37.149Z", "log.level": "INFO", "message":"creating NettyAllocator with the following configs: [name=elasticsearch_configured, chunk_size=1mb, suggested_max_allocation_size=1mb, factors={es.unsafe.use_netty_default_chunk_and_page_size=false, g1gc_enabled=true, g1gc_region_size=4mb}]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.transport.netty4.NettyAllocator","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:37.359Z", "log.level": "INFO", "message":"using rate limit [40mb] with [default=40mb, read=0b, write=0b, max=0b]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.indices.recovery.RecoverySettings","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:37.553Z", "log.level": "INFO", "message":"using discovery type [multi-node] and seed hosts providers [settings]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.discovery.DiscoveryModule","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:45.947Z", "log.level": "INFO", "message":"initialized", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.node.Node","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:45.948Z", "log.level": "INFO", "message":"starting ...", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.node.Node","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:47.062Z", "log.level": "INFO", "message":"persistent cache index loaded", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.xpack.searchablesnapshots.cache.full.PersistentCache","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:47.063Z", "log.level": "INFO", "message":"deprecation component started", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.xpack.deprecation.logging.DeprecationIndexingComponent","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:47.954Z", "log.level": "INFO", "message":"publish_address {10.131.12.67:9300}, bound_addresses {[::]:9300}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.transport.TransportService","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:54.857Z", "log.level": "INFO", "message":"bound or publishing to a non-loopback address, enforcing bootstrap checks", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.bootstrap.BootstrapChecks","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:54.956Z", "log.level": "INFO", "message":"cluster UUID [0Y1NNyQyT7K8Km3vrCeYPA]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.cluster.coordination.Coordinator","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:51:05.055Z", "log.level": "WARN", "message":"master not discovered or elected yet, an election requires a node with id [mD5UyLYxTIW3uZ4LxArmkw], have only discovered non-quorum [{elasticsearch-master-0}{phv46TIaT0-ymV_7nVkriQ}{arqGZH88S3WisRmnRktRXA}{10.131.12.67}{10.131.12.67:9300}{dilmr}, {elasticsearch-master-1}{bNF9V12pQeK4x-kNDpTTsQ}{pmGe92wrT8C5m9pDYY6wPw}{10.128.14.130}{10.128.14.130:9300}{dilmr}]; discovery will continue using [10.128.14.130:9300] from hosts providers and [{elasticsearch-master-0}{phv46TIaT0-ymV_7nVkriQ}{arqGZH88S3WisRmnRktRXA}{10.131.12.67}{10.131.12.67:9300}{dilmr}] from last-known cluster state; node term 3, last-accepted version 498 in term 3", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-0][cluster_coordination][T#1]","log.logger":"org.elasticsearch.cluster.coordination.ClusterFormationFailureHelper","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:51:15.057Z", "log.level": "WARN", "message":"master not discovered or elected yet, an election requires a node with id [mD5UyLYxTIW3uZ4LxArmkw], have only discovered non-quorum [{elasticsearch-master-0}{phv46TIaT0-ymV_7nVkriQ}{arqGZH88S3WisRmnRktRXA}{10.131.12.67}{10.131.12.67:9300}{dilmr}, {elasticsearch-master-1}{bNF9V12pQeK4x-kNDpTTsQ}{pmGe92wrT8C5m9pDYY6wPw}{10.128.14.130}{10.128.14.130:9300}{dilmr}]; discovery will continue using [10.128.14.130:9300] from hosts providers and [{elasticsearch-master-0}{phv46TIaT0-ymV_7nVkriQ}{arqGZH88S3WisRmnRktRXA}{10.131.12.67}{10.131.12.67:9300}{dilmr}] from last-known cluster state; node term 3, last-accepted version 498 in term 3", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-0][cluster_coordination][T#1]","log.logger":"org.elasticsearch.cluster.coordination.ClusterFormationFailureHelper","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}

And the logs for "elasticsearch-master-1":

{"@timestamp":"2023-06-07T12:49:44.712Z", "log.level": "INFO", "message":"node name [elasticsearch-master-1], node ID [bNF9V12pQeK4x-kNDpTTsQ], cluster name [elasticsearch], roles [master, data, remote_cluster_client, ingest, ml]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.node.Node","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:16.802Z", "log.level": "INFO", "message":"[controller/524] [Main.cc@123] controller (64 bit): Version 8.2.3 (Build 537f37a54d22f1) Copyright (c) 2022 Elasticsearch BV", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"ml-cpp-log-tail-thread","log.logger":"org.elasticsearch.xpack.ml.process.logging.CppLogMessageHandler","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:17.715Z", "log.level": "INFO", "message":"Security is enabled", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.xpack.security.Security","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:19.801Z", "log.level": "INFO", "message":"license mode is [trial], currently licensed security realms are [reserved/reserved,file/default_file,native/default_native]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.xpack.security.authc.Realms","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:19.809Z", "log.level": "INFO", "message":"parsed [0] roles from file [/usr/share/elasticsearch/config/roles.yml]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.xpack.security.authz.store.FileRolesStore","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:27.906Z", "log.level": "INFO", "message":"creating NettyAllocator with the following configs: [name=elasticsearch_configured, chunk_size=1mb, suggested_max_allocation_size=1mb, factors={es.unsafe.use_netty_default_chunk_and_page_size=false, g1gc_enabled=true, g1gc_region_size=4mb}]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.transport.netty4.NettyAllocator","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:28.112Z", "log.level": "INFO", "message":"using rate limit [40mb] with [default=40mb, read=0b, write=0b, max=0b]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.indices.recovery.RecoverySettings","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:28.306Z", "log.level": "INFO", "message":"using discovery type [multi-node] and seed hosts providers [settings]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.discovery.DiscoveryModule","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:35.718Z", "log.level": "INFO", "message":"initialized", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.node.Node","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:35.718Z", "log.level": "INFO", "message":"starting ...", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.node.Node","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:35.941Z", "log.level": "INFO", "message":"persistent cache index loaded", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.xpack.searchablesnapshots.cache.full.PersistentCache","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:35.998Z", "log.level": "INFO", "message":"deprecation component started", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.xpack.deprecation.logging.DeprecationIndexingComponent","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:36.903Z", "log.level": "INFO", "message":"publish_address {10.128.14.130:9300}, bound_addresses {[::]:9300}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.transport.TransportService","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:38.188Z", "log.level": "INFO", "message":"bound or publishing to a non-loopback address, enforcing bootstrap checks", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.bootstrap.BootstrapChecks","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:48.203Z", "log.level": "WARN", "message":"master not discovered or elected yet, an election requires two nodes with ids [phv46TIaT0-ymV_7nVkriQ, bNF9V12pQeK4x-kNDpTTsQ], have only discovered non-quorum [{elasticsearch-master-1}{bNF9V12pQeK4x-kNDpTTsQ}{pmGe92wrT8C5m9pDYY6wPw}{10.128.14.130}{10.128.14.130:9300}{dilmr}]; discovery will continue using [10.131.12.67:9300] from hosts providers and [{elasticsearch-master-1}{bNF9V12pQeK4x-kNDpTTsQ}{pmGe92wrT8C5m9pDYY6wPw}{10.128.14.130}{10.128.14.130:9300}{dilmr}] from last-known cluster state; node term 0, last-accepted version 0 in term 0", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-1][cluster_coordination][T#1]","log.logger":"org.elasticsearch.cluster.coordination.ClusterFormationFailureHelper","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:50:58.212Z", "log.level": "WARN", "message":"master not discovered or elected yet, an election requires two nodes with ids [phv46TIaT0-ymV_7nVkriQ, bNF9V12pQeK4x-kNDpTTsQ], have only discovered non-quorum [{elasticsearch-master-1}{bNF9V12pQeK4x-kNDpTTsQ}{pmGe92wrT8C5m9pDYY6wPw}{10.128.14.130}{10.128.14.130:9300}{dilmr}]; discovery will continue using [10.131.12.67:9300] from hosts providers and [{elasticsearch-master-1}{bNF9V12pQeK4x-kNDpTTsQ}{pmGe92wrT8C5m9pDYY6wPw}{10.128.14.130}{10.128.14.130:9300}{dilmr}] from last-known cluster state; node term 0, last-accepted version 0 in term 0", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-1][cluster_coordination][T#1]","log.logger":"org.elasticsearch.cluster.coordination.ClusterFormationFailureHelper","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:51:08.214Z", "log.level": "WARN", "message":"master not discovered or elected yet, an election requires two nodes with ids [phv46TIaT0-ymV_7nVkriQ, bNF9V12pQeK4x-kNDpTTsQ], have discovered possible quorum [{elasticsearch-master-1}{bNF9V12pQeK4x-kNDpTTsQ}{pmGe92wrT8C5m9pDYY6wPw}{10.128.14.130}{10.128.14.130:9300}{dilmr}, {elasticsearch-master-0}{phv46TIaT0-ymV_7nVkriQ}{arqGZH88S3WisRmnRktRXA}{10.131.12.67}{10.131.12.67:9300}{dilmr}]; discovery will continue using [10.131.12.67:9300] from hosts providers and [{elasticsearch-master-1}{bNF9V12pQeK4x-kNDpTTsQ}{pmGe92wrT8C5m9pDYY6wPw}{10.128.14.130}{10.128.14.130:9300}{dilmr}] from last-known cluster state; node term 0, last-accepted version 0 in term 0", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-1][cluster_coordination][T#1]","log.logger":"org.elasticsearch.cluster.coordination.ClusterFormationFailureHelper","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}

The logs show that the two nodes expect different node IDs as the potential masters: elasticsearch-master-1 expects the IDs of the two nodes, while elasticsearch-master-0 expects an ID (mD5UyLYxTIW3uZ4LxArmkw) that belongs to neither node.
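
To make that comparison easier to see, here is a rough Python sketch that pulls the required and the discovered node IDs out of one of these warning messages. The function name, the regexes and the shortened sample message are my own illustration and are not something Elasticsearch provides; the regexes only assume the message wording seen in the logs above.

```python
import re

# Matches "an election requires ... [id1, id2, ...]" in the warning text.
REQUIRED_RE = re.compile(r"an election requires [^\[]*\[([^\]]*)\]")
# Captures the list of discovered node descriptors.
DISCOVERED_RE = re.compile(
    r"have (?:only )?discovered (?:non-quorum|possible quorum) "
    r"\[(.*?)\]; discovery will continue"
)
# A node descriptor looks like {name}{node id}{ephemeral id}{host}{address}{roles}.
NODE_RE = re.compile(
    r"\{([^{}]*)\}\{([^{}]*)\}\{[^{}]*\}\{[^{}]*\}\{[^{}]*\}\{[^{}]*\}"
)

# Message from elasticsearch-master-0 above, cut off after the discovered list.
SAMPLE_MESSAGE = (
    "master not discovered or elected yet, an election requires a node with id "
    "[mD5UyLYxTIW3uZ4LxArmkw], have only discovered non-quorum "
    "[{elasticsearch-master-0}{phv46TIaT0-ymV_7nVkriQ}{arqGZH88S3WisRmnRktRXA}"
    "{10.131.12.67}{10.131.12.67:9300}{dilmr}, "
    "{elasticsearch-master-1}{bNF9V12pQeK4x-kNDpTTsQ}{pmGe92wrT8C5m9pDYY6wPw}"
    "{10.128.14.130}{10.128.14.130:9300}{dilmr}]; discovery will continue"
)


def election_requirements(message):
    """Return (required_ids, discovered_ids, missing_ids) or None if no match."""
    required = REQUIRED_RE.search(message)
    discovered = DISCOVERED_RE.search(message)
    if required is None or discovered is None:
        return None
    required_ids = {part.strip() for part in required.group(1).split(",")}
    discovered_ids = {
        node_id for _name, node_id in NODE_RE.findall(discovered.group(1))
    }
    return required_ids, discovered_ids, required_ids - discovered_ids


if __name__ == "__main__":
    _required, _discovered, missing = election_requirements(SAMPLE_MESSAGE)
    print(sorted(missing))  # prints ['mD5UyLYxTIW3uZ4LxArmkw']
```

For elasticsearch-master-0 the missing set is the single ID that neither running node has, which is the mismatch described above.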

For now, we have recreated a cluster with the old volume shares, which works fine. Below are the logs for the cluster mounted with the old volume shares:

{"@timestamp":"2023-06-07T12:57:01.460Z", "log.level": "INFO", "message":"node name [elasticsearch-master-0], node ID [t93wdPZiQ8W90vWLLZIjKg], cluster name [elasticsearch], roles [ingest, remote_cluster_client, data, master, ml]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.node.Node","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
...
{"@timestamp":"2023-06-07T12:57:56.962Z", "log.level": "INFO", "message":"publish_address {10.131.12.68:9300}, bound_addresses {[::]:9300}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.transport.TransportService","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:58:04.502Z", "log.level": "INFO", "message":"bound or publishing to a non-loopback address, enforcing bootstrap checks", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.bootstrap.BootstrapChecks","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:58:04.649Z", "log.level": "INFO", "message":"cluster UUID [_rQljAVfT22SA0PbM59gJA]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.cluster.coordination.Coordinator","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:58:14.660Z", "log.level": "WARN", "message":"master not discovered or elected yet, an election requires at least 2 nodes with ids from [9pBMe4nxQE6GXRXd9Nd6og, Vltf5NjxT8eXC9R7fWkN4A, t93wdPZiQ8W90vWLLZIjKg], have only discovered non-quorum [{elasticsearch-master-0}{t93wdPZiQ8W90vWLLZIjKg}{FR3WXfk0TnG3F1aHBqQXow}{10.131.12.68}{10.131.12.68:9300}{dilmr}]; discovery will continue using [10.128.14.131:9300] from hosts providers and [{elasticsearch-master-0}{t93wdPZiQ8W90vWLLZIjKg}{FR3WXfk0TnG3F1aHBqQXow}{10.131.12.68}{10.131.12.68:9300}{dilmr}] from last-known cluster state; node term 173, last-accepted version 5980 in term 173", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-0][cluster_coordination][T#1]","log.logger":"org.elasticsearch.cluster.coordination.ClusterFormationFailureHelper","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:58:16.951Z", "log.level": "INFO", "message":"elected-as-master ([2] nodes joined)[_FINISH_ELECTION_, {elasticsearch-master-0}{t93wdPZiQ8W90vWLLZIjKg}{FR3WXfk0TnG3F1aHBqQXow}{10.131.12.68}{10.131.12.68:9300}{dilmr} completing election, {elasticsearch-master-1}{9pBMe4nxQE6GXRXd9Nd6og}{kN1jfU-LSf-8m-XR_OuUFA}{10.128.14.131}{10.128.14.131:9300}{dilmr} completing election], term: 176, version: 5981, delta: master node changed {previous [], current [{elasticsearch-master-0}{t93wdPZiQ8W90vWLLZIjKg}{FR3WXfk0TnG3F1aHBqQXow}{10.131.12.68}{10.131.12.68:9300}{dilmr}]}, added {{elasticsearch-master-1}{9pBMe4nxQE6GXRXd9Nd6og}{kN1jfU-LSf-8m-XR_OuUFA}{10.128.14.131}{10.128.14.131:9300}{dilmr}}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-0][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.service.MasterService","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:58:18.611Z", "log.level": "INFO", "message":"master node changed {previous [], current [{elasticsearch-master-0}{t93wdPZiQ8W90vWLLZIjKg}{FR3WXfk0TnG3F1aHBqQXow}{10.131.12.68}{10.131.12.68:9300}{dilmr}]}, added {{elasticsearch-master-1}{9pBMe4nxQE6GXRXd9Nd6og}{kN1jfU-LSf-8m-XR_OuUFA}{10.128.14.131}{10.128.14.131:9300}{dilmr}}, term: 176, version: 5981, reason: Publication{term=176, version=5981}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-0][clusterApplierService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.service.ClusterApplierService","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:58:18.858Z", "log.level": "INFO", "message":"publish_address {10.131.12.68:9200}, bound_addresses {[::]:9200}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.http.AbstractHttpServerTransport","elasticsearch.cluster.uuid":"_rQljAVfT22SA0PbM59gJA","elasticsearch.node.id":"t93wdPZiQ8W90vWLLZIjKg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:57:06.969Z", "log.level": "INFO", "message":"node name [elasticsearch-master-1], node ID [9pBMe4nxQE6GXRXd9Nd6og], cluster name [elasticsearch], roles [remote_cluster_client, data, master, ml, ingest]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.node.Node","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
...
{"@timestamp":"2023-06-07T12:58:01.508Z", "log.level": "INFO", "message":"publish_address {10.128.14.131:9300}, bound_addresses {[::]:9300}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.transport.TransportService","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:58:11.304Z", "log.level": "INFO", "message":"[gc][11] overhead, spent [293ms] collecting in the last [1s]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-1][scheduler][T#1]","log.logger":"org.elasticsearch.monitor.jvm.JvmGcMonitorService","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:58:12.040Z", "log.level": "INFO", "message":"bound or publishing to a non-loopback address, enforcing bootstrap checks", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.bootstrap.BootstrapChecks","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:58:12.099Z", "log.level": "INFO", "message":"cluster UUID [_rQljAVfT22SA0PbM59gJA]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.cluster.coordination.Coordinator","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:58:16.704Z", "log.level": "INFO", "message":"elected-as-master ([2] nodes joined)[_FINISH_ELECTION_, {elasticsearch-master-0}{t93wdPZiQ8W90vWLLZIjKg}{FR3WXfk0TnG3F1aHBqQXow}{10.131.12.68}{10.131.12.68:9300}{dilmr} completing election, {elasticsearch-master-1}{9pBMe4nxQE6GXRXd9Nd6og}{kN1jfU-LSf-8m-XR_OuUFA}{10.128.14.131}{10.128.14.131:9300}{dilmr} completing election], term: 175, version: 5981, delta: master node changed {previous [], current [{elasticsearch-master-1}{9pBMe4nxQE6GXRXd9Nd6og}{kN1jfU-LSf-8m-XR_OuUFA}{10.128.14.131}{10.128.14.131:9300}{dilmr}]}, added {{elasticsearch-master-0}{t93wdPZiQ8W90vWLLZIjKg}{FR3WXfk0TnG3F1aHBqQXow}{10.131.12.68}{10.131.12.68:9300}{dilmr}}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-1][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.service.MasterService","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2023-06-07T12:58:16.708Z", "log.level": "WARN", "message":"failing [elected-as-master ([2] nodes joined)[_FINISH_ELECTION_, {elasticsearch-master-0}{t93wdPZiQ8W90vWLLZIjKg}{FR3WXfk0TnG3F1aHBqQXow}{10.131.12.68}{10.131.12.68:9300}{dilmr} completing election, {elasticsearch-master-1}{9pBMe4nxQE6GXRXd9Nd6og}{kN1jfU-LSf-8m-XR_OuUFA}{10.128.14.131}{10.128.14.131:9300}{dilmr} completing election]]: failed to commit cluster state version [5981]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-1][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.service.MasterService","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch","error.type":"org.elasticsearch.cluster.coordination.FailedToCommitClusterStateException","error.message":"node is no longer master for term 175 while handling publication","error.stack_trace":"org.elasticsearch.cluster.coordination.FailedToCommitClusterStateException: node is no longer master for term 175 while handling publication\n\tat org.elasticsearch.cluster.coordination.Coordinator.publish(Coordinator.java:1347)\n\tat org.elasticsearch.cluster.service.MasterService.publish(MasterService.java:416)\n\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:309)\n\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:153)\n\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:114)\n\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:170)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:714)\n\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:260)\n\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:223)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)\n\tat java.base/java.lang.Thread.run(Thread.java:833)\n"}
{"@timestamp":"2023-06-07T12:58:18.511Z", "log.level": "INFO", "message":"master node changed {previous [], current [{elasticsearch-master-0}{t93wdPZiQ8W90vWLLZIjKg}{FR3WXfk0TnG3F1aHBqQXow}{10.131.12.68}{10.131.12.68:9300}{dilmr}]}, added {{elasticsearch-master-0}{t93wdPZiQ8W90vWLLZIjKg}{FR3WXfk0TnG3F1aHBqQXow}{10.131.12.68}{10.131.12.68:9300}{dilmr}}, term: 176, version: 5981, reason: ApplyCommitRequest{term=176, version=5981, sourceNode={elasticsearch-master-0}{t93wdPZiQ8W90vWLLZIjKg}{FR3WXfk0TnG3F1aHBqQXow}{10.131.12.68}{10.131.12.68:9300}{dilmr}{ml.machine_memory=5153959936, xpack.installed=true, ml.max_jvm_size=2621440000}}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-1][clusterApplierService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.service.ClusterApplierService","elasticsearch.node.name":"elasticsearch-master-1","elasticsearch.cluster.name":"elasticsearch"}

The cluster has the environment variables "discovery.seed_hosts=elasticsearch-master-headless" (a k8s service) and "discovery.initial_master_nodes=elasticsearch-master-0,elasticsearch-master-1,". Both nodes have the roles ["master", "ingest", "data", "remote_cluster_client", "ml"].

I also noticed that both the old and new clusters keep producing the same node IDs shown in the logs. I don't know whether that's significant.

The troubleshooting guide in the manual is somewhat helpful here. It's not clear exactly what you're doing, but the node ID is tied to the contents of the data path, so if the node IDs are different then the data path contents are not being preserved.

Presumably you mean cluster.initial_master_nodes? But according to these docs:

IMPORTANT: After the cluster forms successfully for the first time, remove the cluster.initial_master_nodes setting from each node’s configuration. Do not use this setting when restarting a cluster or adding a new node to an existing cluster.

Hi,

Our situation is quite simply that we want to delete our current cluster and start a completely fresh new cluster with new, empty volume shares as persistent data store.

As I understand it, we need the "cluster.initial_master_nodes" variable for initial setup.

I noticed that, after the new cluster was created for the first time, the second node wrote the expected files in the data path (_state, node.lock, nodes and snapshot_cache), but the first node did not. The directory is still empty. I will check with our infrastructure team if there is an issue with this volume share and post an update.
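
For anyone wanting to repeat that check, this is roughly how it could be scripted in Python. It is only a sketch: the function name is made up, and the set of expected entries is simply the list of names I saw on the second node, not an official description of the data path layout.

```python
from pathlib import Path

# Entries observed in the second node's data path (an assumption taken from
# this one cluster, not an authoritative list).
EXPECTED_ENTRIES = {"_state", "node.lock", "nodes", "snapshot_cache"}


def data_path_report(data_path):
    """Summarise which expected entries exist directly under data_path."""
    present = {entry.name for entry in Path(data_path).iterdir()}
    return {
        "empty": not present,
        "missing": sorted(EXPECTED_ENTRIES - present),
        "unexpected": sorted(present - EXPECTED_ENTRIES),
    }
```

Calling `data_path_report()` on each node's mounted data directory would show an empty directory as `"empty": True` with all four names listed under `"missing"`.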

Ah ok, yes you do need cluster.initial_master_nodes for the initial setup in that case. But if the cluster were completely fresh then it wouldn't be reporting "an election requires at least 2 nodes with ids from [9pBMe4nxQE6GXRXd9Nd6og, Vltf5NjxT8eXC9R7fWkN4A, t93wdPZiQ8W90vWLLZIjKg]". These node IDs are held in the data path, so it seems your new cluster is not starting completely afresh.

Hi,

The IDs you cited are from the second set of logs in the initial post, belonging to the old, stable cluster, which I added by way of comparison. The first set of logs comes from the new cluster, where the first node expects ID "mD5UyLYxTIW3uZ4LxArmkw", while the second expects [phv46TIaT0-ymV_7nVkriQ, bNF9V12pQeK4x-kNDpTTsQ].

As such, it is unsurprising that the old cluster is not fresh. Apologies for the confusion.

Ah, sorry for the misunderstanding. But the gist of my message remains: the node IDs phv46TIaT0-ymV_7nVkriQ and bNF9V12pQeK4x-kNDpTTsQ are coming from some other non-empty data paths.

It turns out there was a small typo in our Helm charts, which meant we were not actually deleting the data from the previous cluster before setting up the new one. Thus, the first node was using existing cluster data, while the second node was trying to create a new cluster. After fixing the typo, we deleted all the old cluster data so that the two nodes performed a proper clean install. This time it worked.
So in short, it was a typo on our end. Thank you for your time and help.
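
In hindsight the warnings already hinted at this: the first node reported "node term 3, last-accepted version 498 in term 3", while the second reported "node term 0, last-accepted version 0 in term 0". A small Python sketch of that check is below. The function name is hypothetical, and treating "all zeros" as "no previously accepted cluster state" is my reading of these logs rather than a documented rule.

```python
import re

# Matches the tail of the "master not discovered or elected yet" warning.
STATE_RE = re.compile(
    r"node term (\d+), last-accepted version (\d+) in term (\d+)"
)


def has_prior_cluster_state(message):
    """True if the node reports a non-zero term or version, None if no match."""
    match = STATE_RE.search(message)
    if match is None:
        return None
    node_term, version, _accepted_term = (int(g) for g in match.groups())
    return node_term > 0 or version > 0


if __name__ == "__main__":
    first_node = "node term 3, last-accepted version 498 in term 3"
    second_node = "node term 0, last-accepted version 0 in term 0"
    print(has_prior_cluster_state(first_node))   # prints True
    print(has_prior_cluster_state(second_node))  # prints False
```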
