Elasticsearch cluster remains unhealthy because the data node is not able to find the master node


(Kaushal Pandey) #1

Elasticsearch Version 5.5.0
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

OS Version- Linux version 2.6.32-642.11.1.el6.x86_64

Problem- Not able to configure 2 Node(1 Master and

The node is not able to find the master node and throws this error:
"[Node_2] not enough master nodes discovered during pinging (found [[]], but needed [2]), pinging again"

Current Setting.
Master Node setting- /etc/elasticsearch/elasticsearch.yml

cluster.name: search-dev
node.name: Master_Node
discovery.zen.minimum_master_nodes: 1
node.master: true
node.data: true
transport.host: localhost
network.bind_host: 11.123.123.11
http.port: 9200
http.cors.enabled : true
http.cors.allow-origin : "*"
http.cors.allow-methods : OPTIONS, HEAD, GET, POST, PUT, DELETE
http.cors.allow-headers : X-Requested-With,X-Auth-Token,Content-Type, Content-Length

http://myserver:9200/_cluster/health?pretty
Response-
{
"cluster_name" : "search-dev",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 11,
"active_shards" : 11,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 11,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 50.0
}

Node 2 setting- /etc/elasticsearch/elasticsearch.yml

cluster.name: search-dev
node.name: Node1
discovery.zen.ping.unicast.hosts: ["11.123.123.11:9300"]
discovery.zen.minimum_master_nodes: 2
node.master: false
node.data: true
transport.host: localhost
transport.tcp.port: 9300
http.port: 9200
http.cors.enabled : true
http.cors.allow-origin : "*"
http.cors.allow-methods : OPTIONS, HEAD, GET, POST, PUT, DELETE
http.cors.allow-headers : X-Requested-With,X-Auth-Token,Content-Type, Content-Length
network.bind_host: 22.222.22.222

http://myserver:9200/_cluster/health?pretty
Response-
{
"error" : {
"root_cause" : [
{
"type" : "master_not_discovered_exception",
"reason" : null
}
],
"type" : "master_not_discovered_exception",
"reason" : null
},
"status" : 503
}

The node is still not able to find the master.


(Christian Dahlqvist) #2

Why have you configured transport.host to localhost? Are the nodes on the same server?


(Kaushal Pandey) #3

Hi Chris, no, they are on different VMs.


(Christian Dahlqvist) #4

Then I believe you need to change this to the IP of the VM, as they otherwise will not be able to reach each other. You can test if you have connectivity by connecting to port 9300 on the other VM via e.g. telnet.
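
For readers without telnet installed, the same reachability test can be sketched in a few lines of Python. This is only an illustration and not part of any Elasticsearch tooling: the function name `can_connect` and the 3-second default timeout are assumptions made for the example.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the with-block closes it again as soon as the check is done.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run from the second VM, a call such as `can_connect("11.123.123.11", 9300)` (the master address quoted in the first post) returning False would suggest a firewall, routing, or bind-address problem rather than a discovery setting.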


(Kaushal Pandey) #5

Changing it to the IP causes a different error.

[root@vc2crtp2277613n a367949]# tail -f /var/log/elasticsearch/learningsearch-dev.log
[2017-07-12T09:14:44,759][INFO ][o.e.t.TransportService ] [ES_LDIT_Data] publish_address {11.123.45.679:9300}, bound_addresses {11.123.45.679:9300}
[2017-07-12T09:14:44,774][INFO ][o.e.b.BootstrapChecks ] [ES_LDIT_Data] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-07-12T09:14:44,777][ERROR][o.e.b.Bootstrap ] [ES_LDIT_Data] node validation exception
[2] bootstrap checks failed
[1]: max number of threads [1024] for user [elasticsearch] is too low, increase to at least [2048]
[2]: system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
[2017-07-12T09:14:44,779][INFO ][o.e.n.Node ] [ES_LDIT_Data] stopping ...
[2017-07-12T09:14:44,860][INFO ][o.e.n.Node ] [ES_LDIT_Data] stopped
[2017-07-12T09:14:44,861][INFO ][o.e.n.Node ] [ES_LDIT_Data] closing ...
[2017-07-12T09:14:44,874][INFO ][o.e.n.Node ] [ES_LDIT_Data] closed

I already tried changing ulimit; it doesn't help.


(Christian Dahlqvist) #6

Yes, that is correct. You can read about bootstrap checks and how to resolve them here. They are also described in this blog post.


(Kaushal Pandey) #7

Hi Chris,
I have applied all the settings as suggested, but the node is still not able to connect to the master node. Here is the new yml information:
Master Node Yml
cluster.name: search-dev
node.name: Master_Node
discovery.zen.minimum_master_nodes: 1
node.master: true
node.data: true
transport.host: localhost
network.host: 11.111.111.10
http.port: 9200
http.cors.enabled : true
http.cors.allow-origin : "*"
http.cors.allow-methods : OPTIONS, HEAD, GET, POST, PUT, DELETE
http.cors.allow-headers : X-Requested-With,X-Auth-Token,Content-Type, Content-Length
bootstrap.memory_lock: true
path.logs: /apps/ES/logs
path.data: /apps/ES/logs
discovery.zen.ping.unicast.hosts: ["11.111.111.10","22.222.22.222"]
transport.tcp.port: 9300

Node 1 Yml.
cluster.name: search-dev
node.name: Node1
discovery.zen.minimum_master_nodes: 2
node.master: false
node.data: true
transport.host: localhost
network.host: 22.222.22.222
http.port: 9200
http.cors.enabled : true
http.cors.allow-origin : "*"
http.cors.allow-methods : OPTIONS, HEAD, GET, POST, PUT, DELETE
http.cors.allow-headers : X-Requested-With,X-Auth-Token,Content-Type, Content-Length
bootstrap.memory_lock: true
path.logs: /apps/ES/logs
path.data: /apps/ES/logs
discovery.zen.ping.unicast.hosts: ["11.111.111.10","22.222.22.222"]
transport.tcp.port: 9300

Request-
[root@vc2crtp2277613n a367949]# curl -XGET '22.222.22.222:9200/_nodes?filter_path=**.mlockall&pretty'
{
"nodes" : {
"dbjkJu7xQhuTC0qU3ipzSA" : {
"process" : {
"mlockall" : true
}
}
}
}

Still getting the error:
[2017-07-12T11:32:44,483][INFO ][o.e.t.TransportService ] [Node1] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}
[2017-07-12T11:32:44,499][WARN ][o.e.b.BootstrapChecks ] [Node1] max number of threads [1024] for user [elasticsearch] is too low, increase to at least [2048]
[2017-07-12T11:32:44,500][WARN ][o.e.b.BootstrapChecks ] [Node1] system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
[2017-07-12T11:32:47,540][WARN ][o.e.d.z.ZenDiscovery ] [Node1] not enough master nodes discovered during pinging (found [[]], but needed [2]), pinging again
[2017-07-12T11:32:50,542][WARN ][o.e.d.z.ZenDiscovery ] [Node1] not enough master nodes discovered during pinging (found [[]], but needed [2]), pinging again
[2017-07-12T11:32:53,543][WARN ][o.e.d.z.ZenDiscovery ] [Node1] not enough master nodes discovered during pinging (found [[]], but needed [2]), pinging again
[2017-07-12T11:32:56,545][WARN ][o.e.d.z.ZenDiscovery ] [Node1] not enough master nodes discovered during pinging (found [[]], but needed [2]), pinging again


(Christian Dahlqvist) #8

You still have transport.host set to localhost....
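
One way to catch this before restarting a node is to scan elasticsearch.yml for address settings that point at loopback. The sketch below is a rough illustration only: it handles flat `key: value` lines like the ones posted in this thread rather than full YAML, and the helper names (`parse_flat_settings`, `loopback_address_settings`) are made up for the example.

```python
import ipaddress

# Settings that decide which address a node binds to or publishes.
ADDRESS_KEYS = (
    "network.host",
    "transport.host",
    "network.publish_host",
    "transport.publish_host",
)


def parse_flat_settings(text):
    """Parse simple 'key: value' lines; nested YAML is out of scope here."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def is_loopback(value):
    """True for localhost, _local_ or a literal loopback IP address."""
    if value in ("localhost", "_local_"):
        return True
    try:
        return ipaddress.ip_address(value).is_loopback
    except ValueError:
        return False  # a hostname; it cannot be judged without DNS


def loopback_address_settings(text):
    """Return the address settings in the config that point at loopback."""
    settings = parse_flat_settings(text)
    return [k for k in ADDRESS_KEYS if k in settings and is_loopback(settings[k])]
```

Fed the Node1 settings from post #7, this reports `transport.host`, which matches the `publish_address {127.0.0.1:9300}` line in that node's log: other machines cannot reach a node that only publishes a loopback transport address.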


(Kaushal Pandey) #9

Hi Chris,
After changing transport.host: localhost to transport.host: servername, it starts throwing an error.

[root@myserver a367949]# tail -f /apps/ES/logs/search-dev.log
[2017-07-12T12:28:00,357][INFO ][o.e.t.TransportService ] [Master_Node] publish_address {11.111.111.10:9300}, bound_addresses {11.111.111.10:9300}
[2017-07-12T12:28:00,376][INFO ][o.e.b.BootstrapChecks ] [Master_Node] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-07-12T12:28:00,380][ERROR][o.e.b.Bootstrap ] [Master_Node] node validation exception
[2] bootstrap checks failed
[1]: max number of threads [1024] for user [elasticsearch] is too low, increase to at least [2048]
[2]: system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
[2017-07-12T12:28:00,389][INFO ][o.e.n.Node ] [Master_Node] stopping ...
[2017-07-12T12:28:00,439][INFO ][o.e.n.Node ] [Master_Node] stopped
[2017-07-12T12:28:00,439][INFO ][o.e.n.Node ] [Master_Node] closing ...
[2017-07-12T12:28:00,455][INFO ][o.e.n.Node ] [Master_Node] closed
^C
[root@vc2crtp1261413n a367949]# ulimit
unlimited

I am not sure why transport.host is related to this.


(Christian Dahlqvist) #10

Please read the links I provided earlier about bootstrap checks. The error message describes quite well what is wrong, in my opinion.
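
The second failure in these logs, `system call filters failed to install`, is often tied to the kernel version. As far as the Elasticsearch 5.x documentation describes it, the Linux system call filter is built on seccomp and needs kernel 3.5 or newer, while the kernel reported in the first post is 2.6.32. The sketch below only compares a kernel release string against that threshold; the function name and the `(3, 5)` constant are assumptions for illustration, and distribution backports can make the real answer differ.

```python
import re

# Assumed minimum kernel (major, minor) for the seccomp-based filter.
MIN_SECCOMP_KERNEL = (3, 5)


def kernel_supports_seccomp_filter(release):
    """Compare the major.minor part of a kernel release string to 3.5."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    version = (int(match.group(1)), int(match.group(2)))
    return version >= MIN_SECCOMP_KERNEL
```

On a live machine the release string can be taken from `platform.release()`. If the kernel turns out to be too old, the 5.x documentation describes setting `bootstrap.system_call_filter: false` in elasticsearch.yml, which is the "at your own risk" route the error message mentions.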


(Kaushal Pandey) #11

Hi Chris,
Based on the documentation, I made all the changes.

vi /etc/security/limits.conf
elasticsearch - nofile 65536

vi /usr/lib/systemd/system/elasticsearch.service
[Service]
LimitMEMLOCK=infinity

vi /etc/elasticsearch/jvm.options
export ES_JAVA_OPTS="$ES_JAVA_OPTS -Djava.io.tmpdir=/apps/ES/logs"
./bin/elasticsearch

I am still facing the same issue.

[2017-07-14T11:42:54,944][INFO ][o.e.n.Node ] [master-node-1] initializing ...
[2017-07-14T11:42:55,045][INFO ][o.e.e.NodeEnvironment ] [master-node-1] using [1] data paths, mounts [[/apps (/dev/xvdb1)]], net usable_space [43.4gb], net total_space [46.8gb], spins? [no], types [ext4]
[2017-07-14T11:42:55,045][INFO ][o.e.e.NodeEnvironment ] [master-node-1] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-07-14T11:42:55,048][INFO ][o.e.n.Node ] [master-node-1] node name [master-node-1], node ID [tUbqcE97RLiGtdC-z9PmWw]
[2017-07-14T11:42:55,049][INFO ][o.e.n.Node ] [master-node-1] version[5.5.0], pid[8178], build[260387d/2017-06-30T23:16:05.735Z], OS[Linux/2.6.32-642.11.1.el6.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_74/25.74-b02]
[2017-07-14T11:42:55,049][INFO ][o.e.n.Node ] [master-node-1] JVM arguments [-Xms2g, -Xmx2g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+DisableExplicitGC, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -Djdk.io.permissionsUseCanonicalPath=true, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j.skipJansi=true, -XX:+HeapDumpOnOutOfMemoryError, -Des.path.home=/usr/share/elasticsearch]
[2017-07-14T11:42:55,987][INFO ][o.e.p.PluginsService ] [master-node-1] loaded module [aggs-matrix-stats]
[2017-07-14T11:42:55,987][INFO ][o.e.p.PluginsService ] [master-node-1] loaded module [ingest-common]
[2017-07-14T11:42:55,987][INFO ][o.e.p.PluginsService ] [master-node-1] loaded module [lang-expression]
[2017-07-14T11:42:55,987][INFO ][o.e.p.PluginsService ] [master-node-1] loaded module [lang-groovy]
[2017-07-14T11:42:55,987][INFO ][o.e.p.PluginsService ] [master-node-1] loaded module [lang-mustache]
[2017-07-14T11:42:55,987][INFO ][o.e.p.PluginsService ] [master-node-1] loaded module [lang-painless]
[2017-07-14T11:42:55,987][INFO ][o.e.p.PluginsService ] [master-node-1] loaded module [parent-join]
[2017-07-14T11:42:55,987][INFO ][o.e.p.PluginsService ] [master-node-1] loaded module [percolator]
[2017-07-14T11:42:55,987][INFO ][o.e.p.PluginsService ] [master-node-1] loaded module [reindex]
[2017-07-14T11:42:55,987][INFO ][o.e.p.PluginsService ] [master-node-1] loaded module [transport-netty3]
[2017-07-14T11:42:55,987][INFO ][o.e.p.PluginsService ] [master-node-1] loaded module [transport-netty4]
[2017-07-14T11:42:55,988][INFO ][o.e.p.PluginsService ] [master-node-1] no plugins loaded
[2017-07-14T11:42:58,215][INFO ][o.e.d.DiscoveryModule ] [master-node-1] using discovery type [zen]
[2017-07-14T11:42:58,837][INFO ][o.e.n.Node ] [master-node-1] initialized
[2017-07-14T11:42:58,837][INFO ][o.e.n.Node ] [master-node-1] starting ...
[2017-07-14T11:42:59,001][INFO ][o.e.t.TransportService ] [master-node-1] publish_address {11.111.111.10:9300}, bound_addresses {11.111.111.10:9300}
[2017-07-14T11:42:59,021][INFO ][o.e.b.BootstrapChecks ] [master-node-1] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-07-14T11:42:59,027][ERROR][o.e.b.Bootstrap ] [master-node-1] node validation exception
[2] bootstrap checks failed
[1]: max number of threads [1024] for user [elasticsearch] is too low, increase to at least [2048]
[2]: system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
[2017-07-14T11:42:59,029][INFO ][o.e.n.Node ] [master-node-1] stopping ...
[2017-07-14T11:42:59,070][INFO ][o.e.n.Node ] [master-node-1] stopped
[2017-07-14T11:42:59,070][INFO ][o.e.n.Node ] [master-node-1] closing ...
[2017-07-14T11:42:59,090][INFO ][o.e.n.Node ] [master-node-1] closed

path.logs: /apps/ES/logs
path.data: /apps/ES/logs
cluster.name: learningsearch-dev
node.name: Master_Node
bootstrap.memory_lock: true
network.bind_host: 11.111.111.10
discovery.zen.ping.unicast.hosts: ["11.111.111.10","22.222.22.222"]
discovery.zen.minimum_master_nodes: 1
node.master: true
node.data: true
transport.host: 10.111.111.10
http.port: 9200
http.cors.enabled : true
http.cors.allow-origin : "*"
http.cors.allow-methods : OPTIONS, HEAD, GET, POST, PUT, DELETE
http.cors.allow-headers : X-Requested-With,X-Auth-Token,Content-Type, Content-Length
transport.tcp.port: 9300


(Kaushal Pandey) #12

I have tried all the options and all the steps, but the issue still persists. Can someone please look into this?


(Christian Dahlqvist) #13

According to the log, you need to increase the number of threads available to the elasticsearch user and resolve the system call filter failure. What do the settings relevant to your user look like in /etc/security/limits.conf?
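
To answer that question quickly, the entries that apply to a given user can be pulled out of the limits file with a short script. In limits.conf each entry has four fields (domain, type, item, value), and `nproc` is the item that caps the number of processes and threads, whereas the `nofile` entry shown in post #11 only affects open files. The following is a minimal sketch under those assumptions; the helper name `limit_entries` is hypothetical, and it ignores group (`@name`) domains.

```python
def limit_entries(limits_text, user, item):
    """Return (domain, type, value) tuples for `item` that apply to `user`."""
    matches = []
    for raw in limits_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) != 4:
            continue  # blank line or malformed entry
        domain, limit_type, entry_item, value = fields
        # An entry applies if it names the user directly or uses the wildcard.
        if entry_item == item and domain in (user, "*"):
            matches.append((domain, limit_type, value))
    return matches
```

A call such as `limit_entries(open("/etc/security/limits.conf").read(), "elasticsearch", "nproc")` returning an empty list would mean no thread limit has been raised for that user in this file. Files under /etc/security/limits.d/ are read as well and are worth checking the same way.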


(system) #14

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.