Nodes are offline

Hi all,
I deployed a 3-node Elasticsearch cluster using the default docker-compose from the Elastic documentation,
but except for the master node, the other nodes show as offline.

Are there any additional settings I need to configure?
Also, why is the master node's role shown as N/A?

Which version of Elasticsearch are you using?

What does your node configurations look like?

What is in the Elasticsearch logs?

I'm using version 8.8.0.

elasticsearch.yml for master node:

network.host: 0.0.0.0
node.name: es01-master
node.roles: ["master", "remote_cluster_client"]
discovery.type: multi-node
cluster.initial_master_nodes: es01-master
discovery.seed_hosts: ["es02-data", "es03-data"]
bootstrap.memory_lock: true
xpack.security.enabled: true
ingest.geoip.downloader.enabled: false
xpack.monitoring.collection.enabled: true

I also have two data nodes whose elasticsearch.yml files are the same:

network.host: 0.0.0.0
node.name: es02-data
node.roles: ["data"]
discovery.type: multi-node
cluster.initial_master_nodes: es01-master
discovery.seed_hosts: ["es01-master", "es03-data"]
bootstrap.memory_lock: true
xpack.security.enabled: true
ingest.geoip.downloader.enabled: false
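
(As a quick check that all three nodes have actually joined and report the expected roles, something like the following should work; this is only a sketch, assuming port 9200 is published on the host and using the elastic superuser, since xpack.security.enabled: true means HTTPS plus authentication:)

curl -k -u elastic "https://localhost:9200/_cat/nodes?v&h=name,node.role,master,ip"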

There are no unusual logs in the log file, but when I grep for the data nodes in the master node's container I get this log:

@Christian_Dahlqvist

Does this resolve to the correct IP address?

They are connected to two networks. I used curl to check connectivity and they can see each other.

root@elk1:/opt/elk# docker inspect es01-master | grep IPAddress
            "SecondaryIPAddresses": null,
            "IPAddress": "",
                    "IPAddress": "192.168.176.3",
                    "IPAddress": "172.22.0.3",
root@elk1:/opt/elk# docker inspect es02-data | grep IPAddress
            "SecondaryIPAddresses": null,
            "IPAddress": "",
                    "IPAddress": "192.168.176.4",
                    "IPAddress": "172.22.0.4",
root@elk1:/opt/elk# docker inspect es03-data | grep IPAddress
            "SecondaryIPAddresses": null,
            "IPAddress": "",
                    "IPAddress": "192.168.176.5",
                    "IPAddress": "172.22.0.5",

@Christian_Dahlqvist

Your discovery.seed_hosts is wrong. This setting should contain only the master-eligible nodes; es03-data is not master-eligible, so you need to remove it from the setting. I don't think this is the cause of the issue, but you should still remove it.

Since you have only es01-master as master-eligible, only this node should be listed in the discovery.seed_hosts setting.
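
So on all three nodes it would look something like this (sketch, keeping your node names):

discovery.seed_hosts: ["es01-master"]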

Are you sure this isn't an issue with the monitoring? The self-monitoring you are using is deprecated, should no longer be used, and may have some issues.

Do you have any WARN or ERROR logs in your master log? The logs you shared indicate no issue.

What is the result of a _cluster/health request to your master node?
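
(For example, something along these lines, adjusted for your host and credentials since security is enabled:)

curl -k -u elastic "https://localhost:9200/_cluster/health?pretty"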

What do you have in the log of the other nodes?

I changed these settings in all three elasticsearch.yml files:

cluster.initial_master_nodes: es01-master
discovery.seed_hosts: es01-master

but nothing changed.
I haven't configured Metricbeat yet.

And there aren't any WARN or ERROR entries in the logs.

Output of _cluster/health:

{
  "cluster_name": "naringames_elastic_cluster",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 3,
  "number_of_data_nodes": 2,
  "active_primary_shards": 375,
  "active_shards": 750,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100
}

Log from one of the other nodes:

es02-data  | {"@timestamp":"2023-06-14T00:28:39.819Z", "log.level": "WARN", "message":"this node is locked into cluster UUID [irImass0TMyQS-ofDmTgKw] but [cluster.initial_master_nodes] is set to [es01-master]; remove this setting to avoid possible data loss caused by subsequent cluster bootstrap attempts; for further information see https://www.elastic.co/guide/en/elasticsearch/reference/8.8/important-settings.html#initial_master_nodes", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[es02-data][scheduler][T#1]","log.logger":"org.elasticsearch.cluster.coordination.ClusterBootstrapService","elasticsearch.cluster.uuid":"irImass0TMyQS-ofDmTgKw","elasticsearch.node.id":"PvNPsykNTuCPFnkdOHF9Ig","elasticsearch.node.name":"es02-data","elasticsearch.cluster.name":"naringames_elastic_cluster"}
es02-data  | {"@timestamp":"2023-06-14T07:54:02.941Z", "log.level": "INFO", "message":"master node [{es01-master}{4OGHbn5iQH-tWnUe6KT6UA}{DYOu_W-CSM-Q8letxLLBlw}{es01-master}{172.22.0.3}{172.22.0.3:9300}{mr}{8.8.0}] disconnected, restarting discovery", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[es02-data][cluster_coordination][T#1]","log.logger":"org.elasticsearch.cluster.coordination.Coordinator","elasticsearch.cluster.uuid":"irImass0TMyQS-ofDmTgKw","elasticsearch.node.id":"PvNPsykNTuCPFnkdOHF9Ig","elasticsearch.node.name":"es02-data","elasticsearch.cluster.name":"naringames_elastic_cluster"}
es02-data  | {"@timestamp":"2023-06-14T07:55:15.981Z", "log.level": "WARN", "message":"this node is locked into cluster UUID [irImass0TMyQS-ofDmTgKw] but [cluster.initial_master_nodes] is set to [es01-master]; remove this setting to avoid possible data loss caused by subsequent cluster bootstrap attempts; for further information see https://www.elastic.co/guide/en/elasticsearch/reference/8.8/important-settings.html#initial_master_nodes", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.cluster.coordination.ClusterBootstrapService","elasticsearch.node.name":"es02-data","elasticsearch.cluster.name":"naringames_elastic_cluster"}
es02-data  | {"@timestamp":"2023-06-14T07:55:19.922Z", "log.level": "INFO", "message":"master node changed {previous [], current [{es01-master}{4OGHbn5iQH-tWnUe6KT6UA}{9UTrBDTjQMCCk9QEA8Ef8w}{es01-master}{172.22.0.3}{172.22.0.3:9300}{mr}{8.8.0}]}, added {{es01-master}{4OGHbn5iQH-tWnUe6KT6UA}{9UTrBDTjQMCCk9QEA8Ef8w}{es01-master}{172.22.0.3}{172.22.0.3:9300}{mr}{8.8.0}, {es03-data}{UpA2j6uARM2h10-6UVrCRg}{B-31WnipR1SWFt8xXypOSQ}{es03-data}{172.22.0.5}{172.22.0.5:9300}{d}{8.8.0}}, term: 3, version: 2077, reason: ApplyCommitRequest{term=3, version=2077, sourceNode={es01-master}{4OGHbn5iQH-tWnUe6KT6UA}{9UTrBDTjQMCCk9QEA8Ef8w}{es01-master}{172.22.0.3}{172.22.0.3:9300}{mr}{8.8.0}{xpack.installed=true}}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[es02-data][clusterApplierService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.service.ClusterApplierService","elasticsearch.node.name":"es02-data","elasticsearch.cluster.name":"naringames_elastic_cluster"}

@leandrojmp

Your cluster is green, which means it is working without any issue. It also shows that you have 3 nodes in the cluster, 2 of which are data nodes; this matches what you shared from your docker-compose.

I see nothing that would make the cluster offline; as I said, this looks like a monitoring issue.

Self-monitoring is deprecated and no longer recommended; you should use Metricbeat to monitor your cluster. But if I'm not mistaken, for self-monitoring to work you need the setting xpack.monitoring.collection.enabled: true on every node, and from what you shared you only have it on the master node.
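
In other words, if you keep self-monitoring for now, each data node's elasticsearch.yml would also need the same line you already have on the master (sketch):

xpack.monitoring.collection.enabled: true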

This is a warning that happens on a restart. It means you still have the cluster.initial_master_nodes setting in your elasticsearch.yml; this setting should be removed once the cluster has formed for the first time. Remove it and the warning will disappear the next time you restart your cluster.
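
So after the first successful bootstrap, the discovery part of each elasticsearch.yml can be reduced to something like (sketch):

discovery.seed_hosts: ["es01-master"]
# cluster.initial_master_nodes removed once the cluster has formed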

@leandrojmp
"I'm grateful that I was finally able to enable online monitoring using xpack.monitoring.collection.enabled: true. I have another question. Could you please explain how the nodes function? For instance, if I run multiple pipelines with logstash, will the cluster distribute the workload across the nodes to alleviate pressure on the central server and prevent it from reaching its CPU capacity? Additionally, do you have any documentation available that covers the management of requests and queries within a cluster?"

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.