Elasticsearch cluster setup with 3 nodes

Hi,
I am trying to set up an Elasticsearch 8.3 cluster with 3 nodes (2 master, 1 data node) on VMware (Ubuntu). The IPs are 192.168.15.11 (master), 192.168.15.12 (master) and 192.168.15.13.
Below is the elasticsearch.yml file from one master node:

cluster.name: rh-cluster
node.name: node-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
#bootstrap.memory_lock: true
network.host: 192.168.15.11
http.port: 9200
discovery.seed_hosts: ["192.168.15.11", "192.168.15.12", "192.168.15.13"]
cluster.initial_master_nodes: ["node-1", "node-2"]
#readiness.port: 9399
#action.destructive_requires_name: false
xpack.security.enabled: true
xpack.security.enrollment.enabled: true

xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12

xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12

#http.host: 0.0.0.0
http.host: [_local_, _site_]

#transport.host: 0.0.0.0
transport.host: [_local_, _site_]

I am getting the cluster health response below:
curl -k -u elastic:elastic -XGET https://192.168.15.11:9200/_cluster/health?pretty

{
  "cluster_name" : "rh-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 2,
  "active_shards" : 4,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

But in the log file I can see the warnings below:

[2022-07-21T01:54:29,863][WARN ][o.e.d.PeerFinder         ] [node-1] address [192.168.15.13:9300], node [null], requesting [false] discovery result: [][192.168.15.13:9300] connect_exception: Connection refused: /192.168.15.13:9300: Connection refused
[2022-07-21T01:54:30,601][WARN ][o.e.c.c.ClusterFormationFailureHelper] [node-1] master not discovered yet, this node has not previously joined a bootstrapped cluster, and this node must discover master-eligible nodes [192.168.15.11, 192.168.15.12] to bootstrap a cluster: have discovered [{node-1}{RSRD69EIQna5g}{DqcocwPgTV}{node-1}{192.168.15.11}{192.168.15.11:9300}{cdfhilmrstw}]; discovery will continue using [192.168.15.12:9300, 192.168.15.13:9300] from hosts providers and [{node-1}{RSRD69EIQna5g}{DqcocwPgTV}{node-1}{192.168.15.11}{192.168.15.11:9300}{cdfhilmrstw}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2022-07-21T01:54:30,863][WARN ][o.e.d.PeerFinder         ] [node-1] address [192.168.15.12:9300], node [null], requesting [false] discovery result: [][192.168.15.12:9300] connect_exception: Connection refused: /192.168.15.12:9300: Connection refused
[2022-07-21T01:54:30,863][WARN ][o.e.d.PeerFinder         ] [node-1] address [192.168.15.13:9300], node [null], requesting [false] discovery result: [][192.168.15.13:9300] connect_exception: Connection refused: /192.168.15.13:9300: Connection refused
[2022-07-21T01:54:31,863][WARN ][o.e.d.PeerFinder         ] [node-1] address [192.168.15.12:9300], node [null], requesting [false] discovery result: [][192.168.15.12:9300] connect_exception: Connection refused: /192.168.15.12:9300: Connection refused
[2022-07-21T01:54:31,863][WARN ][o.e.d.PeerFinder         ] [node-1] address [192.168.15.13:9300], node [null], requesting [false] discovery result: [][192.168.15.13:9300] connect_exception: Connection refused: /192.168.15.13:9300: Connection refused
[2022-07-21T01:54:32,863][WARN ][o.e.d.PeerFinder         ] [node-1] address [192.168.15.12:9300], node [null], requesting [false] discovery result: [][192.168.15.12:9300] connect_exception: Connection refused: /192.168.15.12:9300: Connection refused
[2022-07-21T01:54:32,863][WARN ][o.e.d.PeerFinder         ] [node-1] address [192.168.15.13:9300], node [null], requesting [false] discovery result: [][192.168.15.13:9300] connect_exception: Connection refused: /192.168.15.13:9300: Connection refused
[2022-07-21T01:54:33,863][WARN ][o.e.d.PeerFinder         ] [node-1] address [192.168.15.12:9300], node [null], requesting [false] discovery result: [][192.168.15.12:9300] connect_exception: Connection refused: /192.168.15.12:9300: Connection refused

Please suggest why I am having this issue, or whether anything is misconfigured in the yml file.
Thank you in advance.
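
A note on the warnings above: "Connection refused" on port 9300 generally means the host answered but nothing was listening on the transport port at that moment, for example because Elasticsearch on that node had not started yet or was bound to a different address. Below is a minimal sketch for probing the transport port from node-1. It assumes bash and the coreutils timeout command are available; the function name port_open and the script name in the usage note are made up for this example.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within 2 seconds,
# non-zero if it is refused, unreachable, or times out.
# Relies on bash's built-in /dev/tcp redirection (hypothetical helper).
port_open() {
  local host=$1 port=$2
  timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Probe the transport port (9300) on every host passed as an argument.
for host in "$@"; do
  if port_open "$host" 9300; then
    echo "$host:9300 open"
  else
    echo "$host:9300 closed or unreachable"
  fi
done
```

Saved as check_transport.sh, it could be run as: bash check_transport.sh 192.168.15.12 192.168.15.13. If the ports show as open once all nodes are running, the warnings were probably logged only while the other nodes were still starting.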

Welcome to our community!

Please format your code/logs/config using the </> button, or markdown-style backticks. It makes things easier to read, which helps us help you.

Thank you Mark.

I have formatted the content. Please help me to fix the issue.

Can you curl 192.168.15.12:9200?

I tried curl against all three nodes:
curl -k -u elastic:elastic -XGET https://192.168.15.11:9200/_cluster/health?pretty
curl -k -u elastic:elastic -XGET https://192.168.15.12:9200/_cluster/health?pretty
curl -k -u elastic:elastic -XGET https://192.168.15.13:9200/_cluster/health?pretty

I am getting a proper response.

curl 192.168.15.12:9200 gives the response below:
curl: (52) Empty reply from server

What do the logs from that node show?

Below is the log from node-2.

Caused by: sun.security.validator.ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
        at sun.security.validator.PKIXValidator.doValidate(PKIXValidator.java:318) ~[?:?]
        at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:267) ~[?:?]
        at sun.security.validator.Validator.validate(Validator.java:256) ~[?:?]
        at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:285) ~[?:?]
        at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:144) ~[?:?]
        at org.elasticsearch.common.ssl.DiagnosticTrustManager.checkServerTrusted(DiagnosticTrustManager.java:102) ~[elasticsearch-ssl-config-8.3.2.jar:?]
        at sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1329) ~[?:?]
        at sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1226) ~[?:?]
        at sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1169) ~[?:?]
        at sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:396) ~[?:?]
        at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:480) ~[?:?]
        at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1277) ~[?:?]
        at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1264) ~[?:?]
        at java.security.AccessController.doPrivileged(AccessController.java:712) ~[?:?]
        at sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1209) ~[?:?]
        at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1548) ~[netty-handler-4.1.76.Final.jar:?]
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1394) ~[netty-handler-4.1.76.Final.jar:?]
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1235) ~[netty-handler-4.1.76.Final.jar:?]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1284) ~[netty-handler-4.1.76.Final.jar:?]
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510) ~[netty-codec-4.1.76.Final.jar:?]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449) ~[netty-codec-4.1.76.Final.jar:?]
        ... 16 more
Caused by: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
        at sun.security.provider.certpath.PKIXCertPathValidator.validate(PKIXCertPathValidator.java:157) ~[?:?]
        at sun.security.provider.certpath.PKIXCertPathValidator.engineValidate(PKIXCertPathValidator.java:83) ~[?:?]
        at java.security.cert.CertPathValidator.validate(CertPathValidator.java:309) ~[?:?]
        at sun.security.validator.PKIXValidator.doValidate(PKIXValidator.java:313) ~[?:?]
        at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:267) ~[?:?]
        at sun.security.validator.Validator.validate(Validator.java:256) ~[?:?]
        at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:285) ~[?:?]
        at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:144) ~[?:?]
        at org.elasticsearch.common.ssl.DiagnosticTrustManager.checkServerTrusted(DiagnosticTrustManager.java:102) ~[elasticsearch-ssl-config-8.3.2.jar:?]

Having 2 master nodes and 1 data node is a bad idea, as you always want to have 3 master-eligible nodes in a cluster and at least 2 nodes that hold data.
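
To make the arithmetic behind that advice concrete, here is a small sketch of the general majority rule: a quorum is more than half of the master-eligible nodes. The function names are made up for this example, and Elasticsearch manages its voting configuration itself, so treat this as an illustration of the principle rather than of its exact internals.

```shell
#!/bin/sh
# Majority (quorum) needed among N master-eligible nodes: floor(N/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of master-eligible nodes that can be lost while a majority remains.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare clusters with 1, 2 and 3 master-eligible nodes.
for n in 1 2 3; do
  echo "master-eligible=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 2 master-eligible nodes the majority is 2, so losing either one leaves no majority. With 3 the majority is still 2, so one node can be lost.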

I followed the steps below to set up Elasticsearch on all three nodes.
1- Installed ES using the Debian package on all three nodes.
2- Copied the 'certs' folder from node-1 to node-2 and node-3 to set up xpack.security.
3- Removed and re-added the 'http' and 'transport' secure passwords on node-2 and node-3, using the values from node-1, with the commands below, because ES by default creates the passwords at installation time:

/usr/share/elasticsearch/bin/elasticsearch-keystore remove xpack.security.http.ssl.keystore.secure_password
/usr/share/elasticsearch/bin/elasticsearch-keystore remove xpack.security.transport.ssl.keystore.secure_password
/usr/share/elasticsearch/bin/elasticsearch-keystore remove xpack.security.transport.ssl.truststore.secure_password

/usr/share/elasticsearch/bin/elasticsearch-keystore add xpack.security.http.ssl.keystore.secure_password
/usr/share/elasticsearch/bin/elasticsearch-keystore add xpack.security.transport.ssl.keystore.secure_password
/usr/share/elasticsearch/bin/elasticsearch-keystore add xpack.security.transport.ssl.truststore.

Then the rest of the normal steps.

Are the above steps fine for changing the password, or do I need to create the certificate without a password and copy it to the other 2 nodes?

Also, what do you suggest for a 3-node cluster? How many should be master and data nodes? Thanks
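
On the certificate question: the PKIX error in the node-2 log ("Path does not chain with any of the trust anchors") indicates that node-2 was presented a certificate signed by a CA that is not in its truststore. One common way to avoid this is to create a single CA and sign a certificate for each node with it, so that every node trusts the same anchor. The sketch below only prints the elasticsearch-certutil commands for review; it does not run them. The output file names and the node name node-3 are assumptions for illustration, and the function name is made up.

```shell
#!/bin/sh
# Dry run: print the certutil commands for one shared CA and one transport
# certificate per node. Nothing is executed against Elasticsearch.
# Assumptions: the .p12 file names are hypothetical, and node-3 is assumed
# to be the node name of 192.168.15.13.
CERTUTIL=/usr/share/elasticsearch/bin/elasticsearch-certutil

print_cert_commands() {
  # Step 1: a single CA that every node will trust.
  echo "$CERTUTIL ca --out elastic-stack-ca.p12"
  # Step 2: one certificate per node, signed by that CA.
  while read -r name ip; do
    echo "$CERTUTIL cert --ca elastic-stack-ca.p12 --name $name --ip $ip --out $name-transport.p12"
  done <<EOF
node-1 192.168.15.11
node-2 192.168.15.12
node-3 192.168.15.13
EOF
}

print_cert_commands
```

When run for real, elasticsearch-certutil prompts for passwords, and the secure_password entries in elasticsearch-keystore on each node have to match whatever is chosen there. Each generated file would then be referenced from keystore.path and truststore.path under xpack.security.transport.ssl on the matching node.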

I would recommend all nodes being both master and data, which is the default configuration and ideal for small clusters.

Thank you for the reply, Christian. I'll update that as you suggested.
But the issue I am facing is the one I mentioned above.
Also, can you please give your view on my approach to changing the http and transport passwords?
Thanks

Hello
I am trying a similar scenario to yours:
(1 master, 1 data/ingest node)
This is just for test purposes, but I have the same problem as you.

My configuration is:

cluster.name: elk-test-pavel
node.name: pavel-elastic-master-01.local
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
node.roles: [ master ]
discovery.seed_hosts: ["pavel-elastic-master-01.local"]
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12
cluster.initial_master_nodes: ["pavel-elastic-master-01.local"]
http.host: 0.0.0.0
transport.host: 0.0.0.0

After I put this configuration into elasticsearch.yml and start the node, everything goes smoothly. But when I want to add a data node to this cluster and try to get an enrollment token with:

bin/elasticsearch-create-enrollment-token -s node

I always end up with:

Unexpected http status [401] while attempting to determine cluster health. Will retry at most 5 more times.

I tried leaving out the node.roles: [ master ] parameter and that helps, but then this node cannot be a master node and becomes a coordinating node instead.
I need the nodes to be specified like this:

  1. node.roles: [ master ]
  2. node.roles: [ data, ingest ]

This setup works pretty well in Elastic version 7, but with 8 come the struggles I described above.

Just for the record, I tried to go through this guide:

Are you trying to run before you can walk? Have you managed to get a 3-node cluster working without security? After that you could start trying to implement it. At least that's what I did.

Hi @warkolm ,

Please help me clear up one doubt.
When we install Elasticsearch on different nodes, it creates default certificates with default passwords on each individual node.
Should we use the default certificates on each node (which differ from node to node, with different keystore passwords), or do we need to create the same certificate for each node for the cluster setup?

Hello @warkolm
I am new here in this community.
I have a three-node cluster, but two nodes are on the same virtualization host and one is on another.
I need to set them up as three master and three data nodes. When I update the host, I need to shut down two of the cluster nodes, so I lose quorum and Elasticsearch stops. How can I avoid that, given that discovery.zen.minimum_master_nodes has been deprecated since version 6, which is the last version I worked with?
thank you for help!

@pavelsidla @vnovotny98 please start your own topics for your questions.