Configure cluster

Hi,
I created three VMs with Elasticsearch installed.
I want to create a cluster with one master and two data nodes.
I configured the yml files as follows:
master node:
cluster.name: elastic
node.name: master_elk
node.master: true
node.data: false
network.host: the ip of the master machine
transport.host: localhost
transport.tcp.port: 9300
http.port: 9200
discovery.zen.ping.unicast.hosts: ["master IP", "data 1 ip", "data2 ip"]
discovery.zen.minimum_master_nodes: 1
data 1+2:
cluster.name: elasic
node.name: data1
node.master: true
node.data: true
network.host: the ip of the data machine
transport.host: localhost
transport.tcp.port: 9300
http.port: 9200
discovery.zen.ping.unicast.hosts: ["master IP", "data 1 ip", "data2 ip"]
discovery.zen.minimum_master_nodes: 1

When trying to telnet from the master to the data machine's IP on port 9200 I get a connection, but on port 9300 I don't.
Also, when running a health check on the cluster I get status red.

Any help will be great.
Thanks,
Talia

Forgot to add that the Elasticsearch version I installed is 5.2.

curl -XGET 'http://XXXX:9200/_nodes/transport?pretty'
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "elastic",
  "nodes" : {
    "H4pskB4fQOOmJT4m74goIg" : {
      "name" : "master_elk",
      "transport_address" : "127.0.0.1:9300",
      "host" : "localhost",
      "ip" : "127.0.0.1",
      "version" : "5.2.0",
      "build_hash" : "24e05b9",
      "roles" : [
        "master",
        "ingest"
      ],
      "transport" : {
        "bound_address" : [
          "127.0.0.1:9300",
          "[::1]:9300"
        ],
        "publish_address" : "127.0.0.1:9300",
        "profiles" : { }
      }
    }
  }
}

Just set network.host and you should be ok.

Still not working.

Logs?

When I just set network.host without transport.host, Elasticsearch does not start.

In the log file I have:
[2017-03-05T16:00:43,997][INFO ][o.e.t.TransportService ] [master] publish_address {xxxx:9300}, bound_addresses {xxxx:9300}
[2017-03-05T16:00:44,004][INFO ][o.e.b.BootstrapChecks ] [master] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-03-05T16:00:44,006][ERROR][o.e.b.Bootstrap ] [master] node validation exception
bootstrap checks failed
system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk

This tells you exactly what the problem is. Earlier in your logs there will be output telling you why the system call filters failed to install. Often it's because your kernel doesn't support them. You have two options: change to a kernel that supports the seccomp features needed here, or disable system call filters at your own risk. This is covered in the docs.
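If you want to check the kernel before deciding, on Linux the seccomp mode of the current process is reported in /proc/self/status. This is a generic kernel check, not an Elasticsearch command:

```shell
# Linux-only check: a "Seccomp:" line in /proc/self/status means the
# running kernel was built with seccomp support; if the line is missing,
# the kernel is too old and this bootstrap check will keep failing.
seccomp_line=$(grep -i '^seccomp:' /proc/self/status || true)
echo "${seccomp_line:-no seccomp support reported}"
```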

OK, Thanks

Just for testing discovery, I added this line to the yml file:
bootstrap.system_call_filter: false
Now I can telnet on ports 9200 + 9300, but when I try to run discovery I get a timeout:

curl -XGET 'machine_ip:9200/_cat/health?v&pretty'

Try adding these lines to the yml:
discovery.zen.join_timeout: 90s
discovery.zen.ping_timeout: 30s

Still getting a timeout:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "master_not_discovered_exception",
        "reason" : null
      }
    ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}

In the log:
[WARN ][o.e.n.Node ] [master] timed out while waiting for initial discovery state - timeout: 30s
[INFO ][o.e.h.HttpServer ] [master] publish_address {machine ip:9200}, bound_addresses {machine ip:9200}
[INFO ][o.e.n.Node ] [master] started
[INFO ][o.e.d.z.ZenDiscovery ] [master] failed to send join request to master [{master}{H4pskB4fQOOmJT4m74goIg}{vpQbi0o7R_mog2jiRpv0yg}{machine ip}{machine ip:9300}], reason [RemoteTransportException[[master][machine ip:9300][internal:discovery/zen/join]]; nested: NotMasterException[Node [{master}{H4pskB4fQOOmJT4m74goIg}{mwU8gjwgQWiyRr_xG3FlHg}{machine ip}{machine ip:9300}] not master for join request]; ], tried [3] times
[2017-03-05T19:27:37,324][DEBUG][o.e.a.a.c.h.TransportClusterHealthAction] [master] no known master node, scheduling a retry
[2017-03-05T19:28:07,325][DEBUG][o.e.a.a.c.h.TransportClusterHealthAction] [master] timed out while retrying [cluster:monitor/health] after failure (timeout [30s])
[2017-03-05T19:28:07,326][WARN ][r.suppressed ] path: /_cat/health, params: {pretty=, v=}
org.elasticsearch.discovery.MasterNotDiscoveredException: null
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:211) [elasticsearch-5.2.0.jar:5.2.0]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:307) [elasticsearch-5.2.0.jar:5.2.0]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:237) [elasticsearch-5.2.0.jar:5.2.0]
at org.elasticsearch.cluster.service.ClusterService$NotifyTimeout.run(ClusterService.java:1157) [elasticsearch-5.2.0.jar:5.2.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:527) [elasticsearch-5.2.0.jar:5.2.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_101]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]

Thanks for the replies.

Your configuration above shows:

transport.host: localhost

for the master node. You need to bind both the master and the data node to non-loopback interfaces or they will not be able to connect with each other across your network.
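Concretely, something like this on the master (the 10.0.0.x addresses below are placeholders, not your real IPs):

```yaml
# Master node: bind HTTP and transport to the machine's own address,
# not loopback, so the data nodes can reach port 9300 over the network.
network.host: 10.0.0.1      # placeholder: the master VM's IP
transport.host: 10.0.0.1    # must NOT be localhost in a multi-node cluster
```

and the same pattern on each data node, using that node's own IP.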

I changed the yml and configured transport.host: machine IP.
In the log I can see:
bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks

But still, when running curl -XGET 'machine_ip:9200/_cat/health?v&pretty', I get the same errors in the log.

Maybe there is something I have to configure in the /etc/hosts file?
What am I missing?

Can you telnet from the data nodes to the master on ports 9200/9300?
Also, since you have 3 master-eligible nodes, set:
discovery.zen.minimum_master_nodes: 2
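For reference, the 2 comes from the quorum formula (master_eligible_nodes / 2) + 1, so with 3 master-eligible nodes:

```yaml
# 3 master-eligible nodes -> quorum = (3 / 2) + 1 = 2.
# Leaving this at 1 risks a split brain: two halves of the cluster
# can each elect their own master.
discovery.zen.minimum_master_nodes: 2
```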

Thanks for the reply

Telnet is working from all 3 servers.

Also checked these, and they're OK:
curl 10.15.20.10:9200
curl 10.15.20.11:9200
curl 10.15.20.12:9200

[elasticsearch@master ~]$ curl -XGET 'http://10.15.20.10:9200/_nodes/transport?pretty'
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "elkcl",
  "nodes" : {
    "H4pskB4fQOOmJT4m74goIg" : {
      "name" : "master",
      "transport_address" : "10.15.20.10:9300",
      "host" : "10.15.20.10",
      "ip" : "10.15.20.10",
      "version" : "5.2.0",
      "build_hash" : "24e05b9",
      "roles" : [
        "master",
        "ingest"
      ],
      "transport" : {
        "bound_address" : [
          "10.15.20.10:9300"
        ],
        "publish_address" : "10.15.20.10:9300",
        "profiles" : { }
      }
    }
  }
}

[elasticsearch@ptktl-elkdev2 ~]$ curl -XGET 'http://10.15.20.11:9200/_nodes/transport?pretty'
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "elkcl",
  "nodes" : {
    "H4pskB4fQOOmJT4m74goIg" : {
      "name" : "ptktl-elkdev2",
      "transport_address" : "10.15.20.11:9300",
      "host" : "10.15.20.11",
      "ip" : "10.15.20.11",
      "version" : "5.2.0",
      "build_hash" : "24e05b9",
      "roles" : [
        "data",
        "ingest"
      ],
      "transport" : {
        "bound_address" : [
          "10.15.20.11:9300"
        ],
        "publish_address" : "10.15.20.11:9300",
        "profiles" : { }
      }
    }
  }
}

[elasticsearch@ptktl-elkdev2 ~]$ curl -XGET 'http://10.15.20.12:9200/_nodes/transport?pretty'
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "elkcl",
  "nodes" : {
    "H4pskB4fQOOmJT4m74goIg" : {
      "name" : "ptktl-elkdev3",
      "transport_address" : "10.15.20.12:9300",
      "host" : "10.15.20.12",
      "ip" : "10.15.20.12",
      "version" : "5.2.0",
      "build_hash" : "24e05b9",
      "roles" : [
        "data",
        "ingest"
      ],
      "transport" : {
        "bound_address" : [
          "10.15.20.12:9300"
        ],
        "publish_address" : "10.15.20.12:9300",
        "profiles" : { }
      }
    }
  }
}

I want to have one master and 2 data machines.
The yml files are configured as follows:

master node:
cluster.name: elastic
node.name: master
node.master: true
node.data: false
bootstrap.system_call_filter: false
network.host: 10.15.20.10
transport.host: 10.15.20.10
transport.tcp.port: 9300
http.port: 9200
network.publish_host: 10.15.20.10
discovery.zen.ping.unicast.hosts: ["10.15.20.10:9300", "10.15.20.11:9300", "10.15.20.12:9300"]
discovery.zen.minimum_master_nodes: 1
discovery.zen.join_timeout: 90s
discovery.zen.ping_timeout: 90s

data 1+2:
cluster.name: elasic
node.name: data1
node.master: false
node.data: true
bootstrap.system_call_filter: false
transport.host: 10.15.20.11
transport.tcp.port: 9300
http.port: 9200
network.host: 10.15.20.11
network.publish_host: 10.15.20.11
discovery.zen.ping.unicast.hosts: ["10.15.20.10:9300", "10.15.20.11:9300", "10.15.20.12:9300"]
discovery.zen.minimum_master_nodes: 1
discovery.zen.join_timeout: 90s
discovery.zen.ping_timeout: 90s

[elasticsearch@master ~]$ curl http://10.15.20.10:9200/_cluster/health?pretty=true
{
  "error" : {
    "root_cause" : [
      {
        "type" : "master_not_discovered_exception",
        "reason" : null
      }
    ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}

In the log:
[INFO ][o.e.t.TransportService ] [master] publish_address {10.15.20.10:9300}, bound_addresses {10.15.20.10:9300}
[INFO ][o.e.b.BootstrapChecks ] [master] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[WARN ][o.e.n.Node ] [master] timed out while waiting for initial discovery state - timeout: 30s
[INFO ][o.e.h.HttpServer ] [master] publish_address {10.15.20.10:9200}, bound_addresses {10.15.20.10:9200}
[INFO ][o.e.n.Node ] [master] started
[DEBUG][o.e.a.a.c.h.TransportClusterHealthAction] [master] no known master node, scheduling a retry
[DEBUG][o.e.a.a.c.h.TransportClusterHealthAction] [master] timed out while retrying [cluster:monitor/health] after failure (timeout [30s])
[WARN ][r.suppressed ] path: /_cluster/health, params: {pretty=true}
org.elasticsearch.discovery.MasterNotDiscoveredException: null
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:211) [elasticsearch-5.2.0.jar:5.2.0]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:307) [elasticsearch-5.2.0.jar:5.2.0]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:237) [elasticsearch-5.2.0.jar:5.2.0]
at org.elasticsearch.cluster.service.ClusterService$NotifyTimeout.run(ClusterService.java:1157) [elasticsearch-5.2.0.jar:5.2.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:527) [elasticsearch-5.2.0.jar:5.2.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_101]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
[INFO ][o.e.d.z.ZenDiscovery ] [master] failed to send join request to master [{master}{H4pskB4fQOOmJT4m74goIg}{zEXhVspzSuy69JgbEZbRoQ}{10.15.20.10}{10.15.20.10:9300}], reason [RemoteTransportException[[master][10.15.20.10:9300][internal:discovery/zen/join]]; nested: NotMasterException[Node [{master}{H4pskB4fQOOmJT4m74goIg}{Y0Bi0UqcT-e0mHFC5OYasg}{10.15.20.10}{10.15.20.10:9300}] not master for join request]; ], tried [3] times

Any suggestions?
Also, if I change discovery.zen.minimum_master_nodes to 2 it's still not working.

Please advise

Hi,
Now I have this in the log file on the data servers:
found existing node with the same id but is a different node instance

While researching this error I found:
"I think that you copied the data folder from one to the other. In particular, this means that the node ID was copied along with it, and we do not allow two nodes with the same ID to join the cluster."

Maybe this is my problem?

How can I check the node ID?
How can I change the ID if it's the same?

See my reply above; I uploaded the output of
curl -XGET 'http://10.15.20.12:9200/_nodes/transport?pretty' from each machine.
Is the node ID this part:
"nodes" : {
"H4pskB4fQOOmJT4m74goIg"
If so, how can I change it?
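For reference, yes: the key directly under "nodes" is the node ID. A minimal shell sketch to pull it out of a saved _nodes response (the sample JSON here is trimmed from the output above):

```shell
# Sample trimmed from the _nodes/transport response; in practice:
#   response=$(curl -s http://10.15.20.10:9200/_nodes/transport)
response='{"cluster_name":"elkcl","nodes":{"H4pskB4fQOOmJT4m74goIg":{"name":"master"}}}'

# Node IDs are 22-character URL-safe base64 strings; the only quoted
# token of that length in this response is the ID key under "nodes".
node_id=$(printf '%s' "$response" | grep -oE '"[A-Za-z0-9_-]{22}"' | head -n1 | tr -d '"')
echo "$node_id"   # -> H4pskB4fQOOmJT4m74goIg
```

The ID can't be set directly; it's stored inside the data folder (path.data). If the same ID appears on all three machines because the data folder was copied between them, stop Elasticsearch on the copied nodes, clear their data folder, and restart: a fresh ID is generated on startup. Only do this on nodes whose data you don't need to keep.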

Please advise.
Thanks,
Talia

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.