I have a small ES 6.5.4 cluster (3 nodes) that I'm trying to get running in a Google Cloud project.
I have the latest GCE discovery plugin installed (6.5.4). It appears to start, but the only things I see in the logs related to the GCE discovery service are that it has been started and "timed out while waiting for initial discovery state - timeout: 30s". After that, ZenDiscovery logs a steady stream of "not enough master nodes discovered during pinging" messages, and the nodes are never able to talk to each other. I also tried adding trace-level logging for the discovery plugin, but to no avail. I can't tell whether the plugin is actually working the way it's supposed to; I'm assuming it simply starts when the ES process is started.
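To make the trace logging part concrete, here is a minimal sketch of the kind of settings meant, placed in elasticsearch.yml. The logger names are an assumption derived from the Java package names of the plugin and of ZenDiscovery (the `o.e.d.z` prefix in the log lines), not something copied from a working setup:

```yaml
# Sketch only: raise log levels for GCE discovery and Zen discovery.
# Logger names are assumed from the Java package names.
logger.org.elasticsearch.discovery.gce: TRACE   # GCE hosts provider (assumed package)
logger.org.elasticsearch.cloud.gce: TRACE       # GCE API client used by the plugin (assumed package)
logger.org.elasticsearch.discovery.zen: TRACE   # ZenDiscovery pinging
```

If the hosts provider were being consulted at all, I would expect at least some output from these loggers beyond the startup message.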
The one time I was able to get the nodes to communicate with each other was by setting "discovery.zen.ping.unicast.hosts" on each node and plugging in my array of IPs manually. Unfortunately, this method won't work for my situation, as I need discovery to be more dynamic.
Here's a copy of the Elasticsearch config from one of my nodes (I'm working from a config upgraded from 5.6, so I've commented out unnecessary items):
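For reference, the static setup looks roughly like the sketch below. The first two addresses appear elsewhere in this post; the third is a hypothetical placeholder for the remaining node:

```yaml
# Sketch of the static fallback: every node lists its peers explicitly.
discovery.zen.ping.unicast.hosts:
  - 10.24.8.57
  - 10.24.8.58
  - 10.24.8.59   # hypothetical address for the third node
discovery.zen.minimum_master_nodes: 2
```

With a list like this the nodes find each other, which suggests the transport layer itself is fine and the problem is limited to how the host list gets populated.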
cluster.name: es-65x-development
node.name: es-65x-development-node-ntkc
node.master: true
node.data: true
node.ingest: true
# search.remote.connect: false -- deprecated
node.max_local_storage_nodes: 1
discovery.zen.minimum_master_nodes: 2
path:
  data: /var/lib/elasticsearch
  logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: _gce_
# network.bind_host: 10.24.8.57 defaults to network.host
# network.publish_host: 10.24.8.57
# transport.tcp.port: 9300-9400 -- defaults to 9300-9400
# transport.tcp.compress: false
# http.port: 9200-9300 -- defaults to 9200-9300
http.max_content_length: 100mb
# http.enabled: true --deprecated
http.cors.enabled: false
monitor.jvm.gc.overhead.warn: 100
monitor.jvm.gc.overhead.info: 50
monitor.jvm.gc.overhead.debug: 20
script.allowed_types: inline
script.allowed_contexts: search, update
cloud:
  gce:
    project_id: xyz-development
    zone: us-central1-b
discovery:
  zen.hosts_provider: gce
  gce:
    tags: es-65x-development
I would like to note that I am able to telnet successfully from each node to the others using the IP and port 9200.
If I curl the IP that 9200 is bound to on each of my nodes, it returns something like this:
{
  "name" : "es-65x-development-node-w9b9",
  "cluster_name" : "es-65x-development",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "6.5.4",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "d2ef93d",
    "build_date" : "2018-12-17T21:17:40.758843Z",
    "build_snapshot" : false,
    "lucene_version" : "7.5.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
If I do a
netstat -tuple
I can see Elasticsearch listening on 9200 and 9300 (both tcp6; not sure if that matters).
I'm also able to verify 9200 and 9300 are open from a different node via nmap, as well as confirming there is no firewall enabled on each node with a
sudo ufw status
showing that it is inactive.
Again, these were able to communicate when I set the IPs of the discovery zen hosts manually, but I need to be able to use the gce discovery tool.
Let me know if there's any more information I can provide. I tried searching for this error but I can't seem to find anything that applies to my situation or my version of Elasticsearch, in conjunction with the gce discovery tool.
Thanks in advance!
Sample of logs from one of my nodes:
[2019-01-19T15:12:53,345][WARN ][o.e.d.z.ZenDiscovery ] [es-65x-development-node-w9b9] not enough master nodes discovered during pinging(found [[Candidate{node={es-65x-development-node-w9b9}{w60MIXdGR0m1yFNoVfYbxA}{GXnxS1CYRwuPasmYgOEdAQ}{10.24.8.58}{10.24.8.58:9300}{ml.machine_memory=3877261312, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2019-01-19T15:12:56,529][WARN ][o.e.d.z.ZenDiscovery ] [es-65x-development-node-w9b9] not enough master nodes discovered during pinging(found [[Candidate{node={es-65x-development-node-w9b9}{w60MIXdGR0m1yFNoVfYbxA}{GXnxS1CYRwuPasmYgOEdAQ}{10.24.8.58}{10.24.8.58:9300}{ml.machine_memory=3877261312, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2019-01-19T15:12:59,691][WARN ][o.e.d.z.ZenDiscovery ] [es-65x-development-node-w9b9] not enough master nodes discovered during pinging(found [[Candidate{node={es-65x-development-node-w9b9}{w60MIXdGR0m1yFNoVfYbxA}{GXnxS1CYRwuPasmYgOEdAQ}{10.24.8.58}{10.24.8.58:9300}{ml.machine_memory=3877261312, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, clusterStateVersion=-1}]], but needed [2]), pinging again