Minimum nodes for elasticsearch 7.11.2 cluster?

Hello,
I'm using a standalone ES node at the moment. I would like to add another node in another datacenter to increase data resilience and use cluster functions.

Is this possible with just one additional node? Is there any documentation on installing a secondary node?

Thanks!

Hi @webfr
You should not add nodes across data centers.
The exception is across zones in a public cloud provider such as AWS, GCP or Azure, which have strict low-latency networks (less than 10 ms) within a single region.
If you try to construct a cluster with nodes spread across normal self-managed data centers, the cluster will most likely be unstable and segment / split into separate clusters.

And in general 3 nodes is a better number of nodes than 2 from a cluster resiliency perspective.

The manual has fairly comprehensive instructions on how to set up a resilient cluster.

That doesn't sound right to me. The usual consequence of high latency is poor performance. The cluster should still be stable even with multiple seconds of latency, although it will be pretty much unusable at that level. Under no network conditions will it split into separate clusters.

Thanks, I'm in a test environment. What about 2 nodes: will it work anyway, or will ES 7 force you to have a minimum of 3? Which roles should I assign?

Thanks :slightly_smiling_face:

Perhaps I am using the incorrect terms; unstable or unusable tend to lead to the same result, which is a cluster that is not resilient and performs poorly.

I have also seen split brain in the field in the past when folks have tried 2 nodes across 2 data centers that have poor network latency / consistency / segmentation.

Perhaps that is better now, and they were not properly configured, but as @DavidTurner recommends, the documentation provided should be used as a reference.

Note thanks David, those pages are new as of 7.7 ... Good to know.

It will work ok to have two nodes but it doesn't really offer increased resilience over a single node. Resilience was one of the things you mentioned in the first post, just making it clear that a two-node setup won't achieve this goal.
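
To put numbers on the two-versus-three question, here is a minimal sketch of plain majority arithmetic. It assumes every node is a voting node and that a strict majority of them is needed to elect a master. It is not how Elasticsearch itself manages its voting configuration, and the function names are made up for illustration.

```shell
#!/bin/sh
# Majority (quorum) size for n voting nodes: floor(n/2) + 1.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# Number of voting nodes that can fail while a majority still remains.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3; do
    echo "$n voting node(s): majority=$(majority "$n"), tolerated failures=$(tolerated_failures "$n")"
done
```

With two voting nodes the majority is still two, so losing either node leaves no majority. Three nodes is the smallest size that survives the loss of one.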

Dear @DavidTurner, thanks for your help. I already have a standalone ES node running fine. I want to create this cluster to gain experience.
For the secondary node, which role should I assign?

For example: https://logz.io/blog/elasticsearch-cluster-tutorial/

  • Data nodes — stores data and executes data-related operations such as search and aggregation
  • Master nodes — in charge of cluster-wide management and configuration actions such as adding and removing nodes

Thanks.

That question is answered on the manual page I linked to above:

We recommend you assign both nodes all other roles except master eligibility.

So both "data" nodes should be fine I suppose, thanks :relaxed:
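
As a rough illustration of the role question, the sketch below prints the one setting that would differ between the two nodes, using the legacy node.master setting that 7.x still accepts. The reading assumed here is that exactly one node stays master-eligible and that both keep their other roles, which are enabled by default. role_fragment is a hypothetical helper, not an Elasticsearch tool.

```shell
#!/bin/sh
# Print the elasticsearch.yml lines that differ between the two nodes.
# $1 = node name, $2 = true|false for master eligibility (legacy 7.x setting).
role_fragment() {
    printf 'node.name: %s\nnode.master: %s\n' "$1" "$2"
}

role_fragment node-1 true    # the single master-eligible node
role_fragment node-2 false   # holds data, but never stands for election
```

Leaving node.data and node.ingest unset keeps them at their default of true, so both nodes still store data and run ingest pipelines.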

Dear @DavidTurner @stephenb please help, I'm trying to make this work :cry:
Thanks a lot for your help.

x = node1 ip
y = node2 ip

NODE 1 (originally working) :

ERROR: "Job for elasticsearch.service failed because a timeout was exceeded. See "systemctl status elasticsearch.service" and "journalctl -xe" for details."
gpslogger.log: "[2021-04-27T00:36:36,166][INFO ][o.e.n.Node ] [node-1] initialized
[2021-04-27T00:36:36,184][INFO ][o.e.n.Node ] [node-1] starting ...
[2021-04-27T00:36:37,301][INFO ][o.e.x.s.c.PersistentCache] [node-1] persistent cache index loaded
[2021-04-27T00:36:38,593][INFO ][o.e.t.TransportService ] [node-1] publish_address {x.x.x.x:9300}, bound_addresses {x.x.x.x:9300}
[2021-04-27T00:36:38,789][INFO ][o.e.n.Node ] [node-1] stopping ...
[2021-04-27T00:36:38,841][INFO ][o.e.n.Node ] [node-1] stopped
[2021-04-27T00:36:38,841][INFO ][o.e.n.Node ] [node-1] closing ...
[2021-04-27T00:36:38,866][INFO ][o.e.n.Node ] [node-1] closed
"

[root@cloud-gfhxnqaf ~]# grep -v '^#' /etc/elasticsearch/elasticsearch.yml

cluster.name: gpslogger
node.name: node-1

node.master: true

node.data: true

node.ingest: true

path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch

network.host: x.x.x.x
http.host: x.x.x.x
network.bind_host: x.x.x.x

discovery.seed_hosts: ["x.x.x.x", "y.y.y.y"]

cluster.initial_master_nodes: ["x.x.x.x"]

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: false
xpack.security.audit.enabled: false

NODE 2 (new instance, working usually, no data) :

ES is starting but I can't log in anymore:
ERROR:
"type": "security_exception",
"reason": "missing authentication credentials for REST request [/]",
"header": {
"WWW-Authenticate": "Basic realm=\"security\" charset=\"UTF-8\""

[root@vps817928 ~]# grep -v '^#' /etc/elasticsearch/elasticsearch.yml

cluster.name: gpslogger
node.name: node-2

node.master: false

node.data: true

node.ingest: true

path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch

network.host: y.y.y.y
http.host: y.y.y.y
network.bind_host: y.y.y.y

discovery.seed_hosts: ["y.y.y.y","x.x.x.x"]

cluster.initial_master_nodes: ["x.x.x.x"]

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.audit.enabled: false

This means that Elasticsearch was starting up normally and was then told to shut down. I'm guessing systemd was what shut it down, but I don't know much about debugging a systemd config, sorry.

Hello David, what about the configurations? Do they look OK to you? Thanks.

In a multi-node cluster you must enable TLS in order to enable security. I suspect this might be why your cluster is not starting up. Your settings will therefore only work for a single-node cluster.
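
For reference, the two configs posted above disagree on xpack.security.transport.ssl.enabled (false on node-1, true on node-2) and neither points at a certificate. A minimal sketch of matching transport TLS settings follows. transport_tls_fragment is a hypothetical helper, and the file names and install path are assumptions based on an RPM install and the default elasticsearch-certutil output names.

```shell
#!/bin/sh
# Print transport TLS settings to add to elasticsearch.yml on BOTH nodes.
# $1 = PKCS#12 file name, relative to the config directory (assumed name).
transport_tls_fragment() {
    cert="${1:-elastic-certificates.p12}"
    cat <<EOF
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: $cert
xpack.security.transport.ssl.truststore.path: $cert
EOF
}

# Certificate generation is interactive, so it is only shown as comments.
# Run once, then copy the resulting .p12 into /etc/elasticsearch on both nodes:
#   /usr/share/elasticsearch/bin/elasticsearch-certutil ca
#   /usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12

transport_tls_fragment
```

Both nodes need the same four lines and a certificate signed by the same CA. The file must also be readable by the elasticsearch user.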

I'm not sure it's that. Looking at the timings of the log messages, it seems that the node is being shut down very soon after startup, just a couple of seconds after logging "starting ...", so it has validated its config but hasn't had much chance to try contacting other nodes yet.

Hard to give a definite answer whether the Elasticsearch config is good without seeing it successfully start up. The TLS config will likely be an issue to solve in future, but that will be apparent from the logs etc. As of now the problem is that something external is stopping the node.

Found a solution for my first node here: "How to prevent systemd service start operation from timing out" (sleeplessbeastie's notes) :slight_smile: I increased the timeout to 300s.
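
For anyone hitting the same start timeout, a systemd drop-in is one way to raise it. This is a sketch that assumes the setting changed was TimeoutStartSec, which the post does not name. The drop-in file name and the write_timeout_override helper are hypothetical.

```shell
#!/bin/sh
# Write a systemd drop-in that raises the start timeout for elasticsearch.service.
# $1 = drop-in directory (defaults to the usual per-unit override location).
write_timeout_override() {
    dir="${1:-/etc/systemd/system/elasticsearch.service.d}"
    mkdir -p "$dir"
    cat > "$dir/startup-timeout.conf" <<'EOF'
[Service]
TimeoutStartSec=300
EOF
}

# As root, apply it and restart the service:
#   write_timeout_override
#   systemctl daemon-reload
#   systemctl restart elasticsearch.service
```

A drop-in keeps the packaged unit file untouched, so the override should survive package upgrades.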

Now trying to fix the secondary node, where I can't log in anymore :frowning:

Error found on first node:
[2021-04-27T14:09:16,076][WARN ][o.e.t.TcpTransport ] [node-1] exception caught on transport layer [Netty4TcpChannel{localAddress=/x.x.x.x:9300, remoteAddress=/y.y.y.y:57064}], closing connection
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: No available authentication scheme

[root@cloud-gfhxnqaf ~]# curl -XGET 'http://x.x.x.x:9200/'
{"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}},"status":401}
[root@cloud-gfhxnqaf ~]#

(LOCAL - first node)
[root@cloud-gfhxnqaf ~]# curl -XGET 'http://x.x.x.x:9200/' -u elastic
Enter host password for user 'elastic':
{
"name" : "node-1",
"cluster_name" : "gpslogger",
"cluster_uuid" : "Fy2DiF6LRSq_1UgoLrAaRg",
"version" : {
"number" : "7.11.2",
"build_flavor" : "default",
"build_type" : "rpm",
"build_hash" : "3e5a16cfec50876d20ea77b075070932c6464c7d",
"build_date" : "2021-03-06T05:54:38.141101Z",
"build_snapshot" : false,
"lucene_version" : "8.7.0",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
[root@cloud-gfhxnqaf ~]#

(LOCAL - secondary node)
[root@vps817928 ~]# curl http://y.y.y.y:9200 -u elastic
Enter host password for user 'elastic':
{"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [elastic] for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"unable to authenticate user [elastic] for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}},"status":401}
[root@vps817928 ~]#

Thanks.

Found :slight_smile: "After elasitcsearch turns on authentication, it reports DecoderException: javax.net.ssl.SSLHandshakeException: No available authentic exception" (Programmer Sought). Certificates are required.
I will close this case for the initial cluster setup, thanks for your help!
