Concept of Elasticsearch cluster

Hi @pyerunka it looks like the nodes have switched roles at some point. Let's work with the new roles:

My master node is on [xyz.com](http://xyz.com/) and data node is on [abcd.com](http://abcd.com/).

Let's adapt those configs:

Please can you:

  1. Stop both nodes (again)
  2. On both nodes, delete the entire "./data" folder, so we start up clean
  3. Edit your Master-node config on [xyz.com] to leave only the following parameters; please delete the others:

cluster.name: Cluster
node.name: Master-node
node.master: true
network.host: _global_
cluster.initial_master_nodes: ["Master-node"]

  4. Edit your Data-node config on [abcd.com] to delete all the extra configuration parameters and leave only the following:

cluster.name: Cluster
node.name: Data-node
node.master: false
network.host: _global_
discovery.seed_hosts: ["xyz.com"]

  5. Start the Master-node on server xyz.com and wait until the console logging shows a message regarding "Cluster health status"
  6. Make the following request from both servers, either from the command line or from a browser:

$ curl -X GET http://xyz.com:9200

You should get some JSON back that looks something like this:

{
  "name" : "node1",
  "cluster_name" : "my_cluster",
  "cluster_uuid" : "5F6fjOcaRXCRTyNFFKvXXQ",
  "version" : {
    "number" : "7.3.1",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "4749ba6",
    "build_date" : "2019-08-19T20:19:25.651794Z",
    "build_snapshot" : false,
    "lucene_version" : "8.1.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

If you DON'T get JSON here, then stop & post the response here

If you DO get JSON then proceed to start the Data-node on the other server abcd.com

Then please post the console output from the Data-node here
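One quick sanity check once both nodes are up: the root endpoint on each server should report the same cluster_uuid if the nodes really joined one cluster. A rough sketch, assuming curl and a POSIX shell; the hostnames are this thread's placeholders:

```shell
#!/bin/sh
# Sketch: confirm two nodes belong to the same cluster by comparing the
# "cluster_uuid" field of their root-endpoint responses (sed, so jq isn't needed).

extract_uuid() {
  sed -n 's/.*"cluster_uuid"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Against a live cluster (placeholder hostnames from this thread):
#   uuid_master=$(curl -s http://xyz.com:9200  | extract_uuid)
#   uuid_data=$(curl -s http://abcd.com:9200 | extract_uuid)
# The two values must be identical if both nodes joined the same cluster.

# Demonstration with a sample response like the one above:
sample='{ "name" : "node1", "cluster_name" : "my_cluster", "cluster_uuid" : "5F6fjOcaRXCRTyNFFKvXXQ" }'
echo "$sample" | extract_uuid    # prints 5F6fjOcaRXCRTyNFFKvXXQ
```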

Hello @Dominic_Page,

I have made these changes on both nodes.
After starting the master node with the following configuration, I am getting the error below while initiating the ES service.

Config file:
cluster.name: cluster
node.name: Master-node
node.master: true
network.host: _global_
cluster.initial_master_nodes: ["xyz.com"]

Error:
[2019-10-21T06:31:00,642][WARN ][o.e.c.c.ClusterFormationFailureHelper] [Master-node] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [xyz.com] to bootstrap a cluster: have discovered []; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9301, 127.0.0.1:9302, 127.0.0.1:9303, 127.0.0.1:9304, 127.0.0.1:9305, [::1]:9300, [::1]:9301, [::1]:9302, [::1]:9303, [::1]:9304, [::1]:9305] from hosts providers and [{Master-node}{Uedn3P1UTK2pNassVMW_-g}{BpbEXfRDSX2Ut7HDKBQoMw}{136.252.129.115}{136.252.129.115:9300}{ml.machine_memory=51539005440, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0

Regards,
Priyanka

Hi @pyerunka

In your Master node configuration you have set

cluster.initial_master_nodes: ["xyz.com"]

Please could you change this to

cluster.initial_master_nodes: ["Master-node"]

And try starting the Master-node again. It looks like your local/global DNS is not resolving [xyz.com], however that doesn't have to be a problem because you can use the Node name in that configuration, see: https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-discovery-bootstrap-cluster.html#modules-discovery-bootstrap-cluster-fqdns

If your Master-node starts up OK, we can take that as confirmation that the DNS resolution has an issue. To prevent that issue affecting the Data-node as well, I would suggest you:

  1. Find the IP of [xyz.com]
  2. Check you can ping that IP from [abcd.com] - if you can't ping the IP, you will have to resolve that before you proceed to step 3.
  3. In the Data-node config, change [xyz.com] to that IP, so that the Data-node can find the Master-node when it starts up
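Putting those steps together, the Data-node config would then look something like this, where 10.0.0.5 is a made-up stand-in for whatever IP [xyz.com] resolves to:

```yaml
# Hypothetical example - replace 10.0.0.5 with the real IP of xyz.com.
cluster.name: Cluster
node.name: Data-node
node.master: false
network.host: _global_
discovery.seed_hosts: ["10.0.0.5"]
```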

Hi @pyerunka

Please can you attach both config files and the console output from the master node

Hello @Dominic_Page,

Master node config file:
cluster.name: InTouch
node.name: Master-node
node.master: true
network.host: _global_
cluster.initial_master_nodes: ["Master-node"]

 Data node config file:
cluster.name: InTouch
node.name: Data-node
node.master: false
network.host: _global_
discovery.seed_hosts: ["ipaddress_of_master_node"]

 And console output from the master node:
 [2019-10-21T09:22:48,126][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [rank-eval]
 [2019-10-21T09:22:48,126][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [reindex]
[2019-10-21T09:22:48,127][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [repository-url]
[2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [transport-netty4]
 [2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [x-pack-ccr]
[2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [x-pack-core]
[2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [x-pack-deprecation]
 [2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [x-pack-graph]
[2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [x-pack-ilm]
[2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [x-pack-logstash]
[2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [x-pack-ml]
 [2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [x-pack-monitoring]
[2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [x-pack-rollup]
 [2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [x-pack-security]
 [2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [x-pack-sql]
[2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] loaded module [x-pack-watcher]
[2019-10-21T09:22:48,128][INFO ][o.e.p.PluginsService     ] [Master-node] no plugins loaded
 [2019-10-21T09:22:55,170][INFO ][o.e.x.s.a.s.FileRolesStore] [Master-node] parsed [0] roles from file [E:\ES\elasticsearch-7.2.0-windows-x86_64\elasticsearch-7.2.0\config\roles.yml]
[2019-10-21T09:23:03,296][DEBUG][o.e.a.ActionModule       ] [Master-node] Using REST wrapper from plugin org.elasticsearch.xpack.security.Security
 [2019-10-21T09:23:03,319][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [Master-node] [controller/12256] [Main.cc@110] controller (64 bit): Version 7.2.0 (Build 65aefcbfce449b) Copyright (c) 2019 Elasticsearch BV
 [2019-10-21T09:23:03,867][INFO ][o.e.d.DiscoveryModule    ] [Master-node] using discovery type [zen] and seed hosts providers [settings]
 [2019-10-21T09:23:05,726][INFO ][o.e.n.Node  ] [Master-node] initialized
 [2019-10-21T09:23:05,742][INFO ][o.e.n.Node ] [Master-node] starting ...
 [2019-10-21T09:23:06,070][INFO ][o.e.t.TransportService   ] [Master-node] publish_address {136.252.129.115:9300}, bound_addresses {136.252.129.115:9300}
[2019-10-21T09:23:06,086][INFO ][o.e.b.BootstrapChecks    ] [Master-node] bound or publishing to a non-loopback address, enforcing bootstrap checks
 [2019-10-21T09:23:06,101][INFO ][o.e.c.c.Coordinator      ] [Master-node] setting initial configuration to VotingConfiguration{NNUY-jVERiqCVCZ8Dk59Hw}
[2019-10-21T09:23:06,242][INFO ][o.e.c.s.MasterService    ] [Master-node] elected-as-master ([1] nodes joined)[{Master-node}{NNUY-jVERiqCVCZ8Dk59Hw}{y5FDFvRCToWJyo5QTBJXGQ}{136.252.129.115}{136.252.129.115:9300}{ml.machine_memory=51539005440, xpack.installed=true, ml.max_open_jobs=20} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 1, version: 1, reason: master node changed {previous [], current [{Master-node}{NNUY-jVERiqCVCZ8Dk59Hw}{y5FDFvRCToWJyo5QTBJXGQ}{136.252.129.115}{136.252.129.115:9300}{ml.machine_memory=51539005440, xpack.installed=true, ml.max_open_jobs=20}]}
 [2019-10-21T09:23:06,289][INFO ][o.e.c.c.CoordinationState] [Master-node] cluster UUID set to [yQvifHj-Tw2RNMCItAT7SA]
    [2019-10-21T09:23:06,336][INFO ][o.e.c.s.ClusterApplierService] [Master-node] master node changed {previous [], current [{Master-node}{NNUY-jVERiqCVCZ8Dk59Hw}{y5FDFvRCToWJyo5QTBJXGQ}{136.252.129.115}{136.252.129.115:9300}{ml.machine_memory=51539005440, xpack.installed=true, ml.max_open_jobs=20}]}, term: 1, version: 1, reason: Publication{term=1, version=1}
[2019-10-21T09:23:06,492][INFO ][o.e.h.AbstractHttpServerTransport] [Master-node] publish_address {136.252.129.115:9200}, bound_addresses {136.252.129.115:9200}
 [2019-10-21T09:23:06,523][INFO ][o.e.n.Node               ] [Master-node] started
 [2019-10-21T09:23:06,726][INFO ][o.e.g.GatewayService     ] [Master-node] recovered [0] indices into cluster_state
 [2019-10-21T09:23:06,976][INFO ][o.e.c.m.MetaDataIndexTemplateService] [Master-node] adding template [.triggered_watches] for index patterns [.triggered_watches*]
[2019-10-21T09:23:07,367][INFO ][o.e.c.m.MetaDataIndexTemplateService] [Master-node] adding template [.watches] for index patterns [.watches*]
[2019-10-21T09:23:07,508][INFO ][o.e.c.m.MetaDataIndexTemplateService] [Master-node] adding template [.watch-history-9] for index patterns [.watcher-history-9*]
[2019-10-21T09:23:07,555][INFO ][o.e.c.m.MetaDataIndexTemplateService] [Master-node] adding template [.monitoring-logstash] for index patterns [.monitoring-logstash-7-*]
 [2019-10-21T09:23:07,617][INFO ][o.e.c.m.MetaDataIndexTemplateService] [Master-node] adding template [.monitoring-es] for index patterns [.monitoring-es-7-*]
 [2019-10-21T09:23:08,028][INFO ][o.e.c.m.MetaDataIndexTemplateService] [Master-node] adding template [.monitoring-beats] for index patterns [.monitoring-beats-7-*]
[2019-10-21T09:23:08,106][INFO ][o.e.c.m.MetaDataIndexTemplateService] [Master-node] adding template [.monitoring-alerts-7] for index patterns [.monitoring-alerts-7]
 [2019-10-21T09:23:08,153][INFO ][o.e.c.m.MetaDataIndexTemplateService] [Master-node] adding template [.monitoring-kibana] for index patterns [.monitoring-kibana-7-*]
 [2019-10-21T09:23:08,184][INFO ][o.e.x.i.a.TransportPutLifecycleAction] [Master-node] adding index lifecycle policy [watch-history-ilm-policy]
[2019-10-21T09:23:08,293][INFO ][o.e.l.LicenseService     ] [Master-node] license [83f56876-00b9-4791-9fbc-2f0f605db5e5] mode [basic] - valid

Regards,
Priyanka

Hi @pyerunka

The Console Log shows your Master-node has now successfully elected itself master

[Master-node] publish_address {XXX.XXX.XXX.XXX:9200}

Also, this shows the IP address where the Data-node needs to look for the Master.

Stop your data node and edit your Data-node config, changing:

discovery.seed_hosts: ["ipaddress_of_master_node"]

to

discovery.seed_hosts: ["XXX.XXX.XXX.XXX"]

You don't need to append the port, as the Master-node will be using the default transport port (9300).

Then, start the Data-node again.

Please post your console log from the Data-node

Hi @pyerunka

What response do you receive when you run the following command from the command line on the data node:

curl -X GET XXX.XXX.XXX.XXX:9300

?

Hello @Dominic_Page,

Thanks for the reply.
I am getting the following response:
Failed to connect to ip port 9300: Timed out

Regards,
Priyanka

Hi @pyerunka

The console log from the Master node has confirmed that the Master node is listening on that ip:port.

You should get a message from the curl request saying that it is not an HTTP port (assuming the Master node is running). I think your configs look OK, so I suggest you leave them as they are and check that the firewall permits TCP traffic on port 9300 between those two IPs.

Apologies for the capitalisation, I am on a mobile device.
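One way to check that from the command line of the data-node server, sketched below with the thread's redacted IP placeholder; how fast curl fails is the useful signal:

```shell
#!/bin/sh
# Probe the master's transport port over TCP from the data-node server.
# XXX.XXX.XXX.XXX stands for the master's redacted IP, as elsewhere in the thread.
# A fast failure (connection refused / protocol error) means TCP reached the host;
# a long hang until the timeout is the classic sign of a firewall dropping packets.
curl -sv --connect-timeout 5 "http://XXX.XXX.XXX.XXX:9300" >/dev/null 2>&1
echo "curl exit code: $?"   # 28 = timed out (likely firewall); 7 = refused; 6 = could not resolve host

# A plain TCP probe, if netcat is installed:
#   nc -zv -w 5 XXX.XXX.XXX.XXX 9300
# On Windows (PowerShell):
#   Test-NetConnection XXX.XXX.XXX.XXX -Port 9300
```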

Hello @Dominic_Page,

Thanks for your reply!
A firewall rule has already been implemented between the two host servers on ports 9300 and 9400.

Regards,
Priyanka

Hi @pyerunka

Is the master node definitely running when you make the curl request?

Hello @Dominic_Page,

Yes, it is responding. I am getting the response below.

{
  "name" : "Master-node",
  "cluster_name" : "Cluster",
  "cluster_uuid" : "yQvifHj-Tw2RNMCItAT7SA",
  "version" : {
    "number" : "7.2.0",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "508c38a",
    "build_date" : "2019-06-20T15:54:18.811730Z",
    "build_snapshot" : false,
    "lucene_version" : "8.0.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
} 

Regards,
Priyanka

Hi @pyerunka

Do you get that same response when you make the curl request from the data node on port 9200 (the HTTP port, rather than 9300, the transport port)?

curl -X GET XXX.XXX.XXX.XXX:9200

If so, you will need to ask your network team to verify that the firewall rule for 9300 actually permits TCP traffic between the two hosts

Hello @Dominic_Page,

I have run the curl request from the data node on port 9200. I am getting the response below:

{
  "name" : "Master-node",
  "cluster_name" : "Cluster",
  "cluster_uuid" : "yQvifHj-Tw2RNMCItAT7SA",
  "version" : {
    "number" : "7.2.0",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "508c38a",
    "build_date" : "2019-06-20T15:54:18.811730Z",
    "build_snapshot" : false,
    "lucene_version" : "8.0.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

And the firewall rule has already been implemented on port 9300.
Can you explain what UDP is? And is that different from opening the firewall?

Regards,
Priyanka

Hi @pyerunka

UDP and TCP are two different transport protocols for sending messages over a network.

Elasticsearch actually uses TCP on both of its ports:

9200 (TCP) - the HTTP port
9300 (TCP) - the transport port

The traffic on 9200 is for API calls: curl, Kibana etc.
The traffic on 9300 is the inter-node messaging within the cluster

Your firewall rule for 9300 may have been created for the wrong protocol, or only in one direction, which might explain why the Data-node has not (yet) been able to connect on 9300

Ask your network team to confirm that TCP is permitted on 9300, in both directions, between the two hosts (and keep TCP on 9200)

For more detail on UDP vs TCP see: https://en.wikipedia.org/wiki/User_Datagram_Protocol#Comparison_of_UDP_and_TCP
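Since the paths in the console log suggest the nodes run on Windows, the host-firewall rules the network team would need look roughly like this sketch. The rule names are made up, and if a central network firewall sits between the servers it needs the equivalent TCP rules instead:

```shell
# Hypothetical Windows host-firewall rules (run in an elevated prompt on each server).
netsh advfirewall firewall add rule name="ES HTTP 9200" dir=in action=allow protocol=TCP localport=9200
netsh advfirewall firewall add rule name="ES transport 9300" dir=in action=allow protocol=TCP localport=9300
```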

Hello @Dominic_Page,

Thanks for the reply!
I will share this with our network team.

Regards,
Priyanka


Hi @pyerunka

Is your data node able to connect now?

Hello @Dominic_Page,

I have raised a request with our network team to open port 9300 between the two servers.
I will let you know once the request is done.

Regards,
Priyanka

Hello @Dominic_Page,

The FireFlow request has been implemented to open port 9300 between the two servers.
After this I tried to start the ES service on the data node, but I am still getting the same error:

2019-10-30T10:45:37,811][WARN ][o.e.c.c.ClusterFormationFailureHelper] [Data-node] master not discovered yet: have discovered ; discovery will continue using [ipaddress_of_master_node:9300] from hosts providers and [{Data-node}{YehkWHtkTgOIuCz-FkNmnQ}{9H-sEJFLT52813lojAeZAg}{ipaddress_of_data_node}{ipaddress_of _data_node:9300}{ml.machine_memory=51539005440, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0

Regards,
Priyanka Yerunkar

Hi @pyerunka

Now that the firewall rule has been implemented, and assuming the Master node is running, what responses do you receive when you send the following individual requests from the command line on the Data node:

curl -X GET XXX.XXX.XXX.XXX:9200

curl -X GET XXX.XXX.XXX.XXX:9300

?