Cluster is not established

I have two EC2 instances, one on private network 1 in AZ 1 and the other on private network 2 in AZ 2, and I am trying to form a cluster from them.

The configuration is the same on both nodes except for node.name:

# Add your configuration lines here
http.host: 0.0.0.0
path.data: /data
path.logs: /var/log/elasticsearch

xpack.security.enabled: true
xpack.security.enrollment.enabled: true

# Enable encryption for HTTP API client connections, such as Kibana, Logstash, and Agents
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12
  keystore.secure_password: "xjKhYWZxTAW29q4Jv8ki9w"

# Enable encryption and mutual authentication between cluster nodes
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12
  keystore.secure_password: "xk3xx54nT92ZZeSALGbaKQ"

cluster.name: dev
node.name: elastic-1b
bootstrap.memory_lock: true
network.host: [_local_,_site_]

discovery.seed_providers: ec2
discovery.ec2.endpoint: ec2.eu-central-1.amazonaws.com
discovery.ec2.tag.cluster: dev
cloud.node.auto_attributes: true
cluster.routing.allocation.awareness.attributes: aws_availability_zone
logger.org.elasticsearch.discovery.ec2: "TRACE"
cluster.initial_master_nodes: [elastic-1a,elastic-1b]

When I run the following on the elastic-1b EC2 instance:
sudo curl --cacert /etc/elasticsearch/certs/http_ca.crt -u elastic https://localhost:9200/_cat/nodes

The result shows only one node:
10.0.16.100 17 97 1 0.07 0.04 0.04 cdfhilmrstw * elastic-1b

What am I doing wrong? Why can't the cluster be established?

Welcome to our community! :smiley:

Is there connectivity between these two AZs? What do your Elasticsearch logs show?

Hi, thanks.

Yes, there is connectivity between the AZs.
From the command line, I pinged the EC2 instance at 10.0.16.100/20 from the one at 10.0.0.100/20, and connectivity is fine.

[ec2-user@ip-10-0-0-100 ~]$ sudo curl --cacert /etc/elasticsearch/certs/http_ca.crt -u elastic https://localhost:9200/_cat/nodes
Enter host password for user 'elastic':
10.0.0.100 53 94 0 0.00 0.00 0.00 cdfhilmrstw * elastic-1a
[ec2-user@ip-10-0-0-100 ~]$ sudo ping 10.0.16.100
PING 10.0.16.100 (10.0.16.100) 56(84) bytes of data.
64 bytes from 10.0.16.100: icmp_seq=1 ttl=127 time=0.624 ms
64 bytes from 10.0.16.100: icmp_seq=2 ttl=127 time=0.632 ms
64 bytes from 10.0.16.100: icmp_seq=3 ttl=127 time=0.629 ms
64 bytes from 10.0.16.100: icmp_seq=4 ttl=127 time=0.603 ms
64 bytes from 10.0.16.100: icmp_seq=5 ttl=127 time=0.637 ms
^C
--- 10.0.16.100 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4170ms
rtt min/avg/max/mdev = 0.603/0.625/0.637/0.011 ms
[ec2-user@ip-10-0-0-100 ~]$

The logs from /var/log/elasticsearch/dev.log (dev is the cluster name):

[2023-05-18T08:06:05,010][WARN ][o.e.h.n.Netty4HttpServerTransport] [elastic-1a] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/10.0.0.100:9200, remoteAddress=/10.0.4.149:38832}
[2023-05-18T08:06:07,508][WARN ][o.e.h.n.Netty4HttpServerTransport] [elastic-1a] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/10.0.0.100:9200, remoteAddress=/10.0.4.149:59632}
[2023-05-18T08:06:08,928][WARN ][o.e.h.n.Netty4HttpServerTransport] [elastic-1a] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/10.0.0.100:9200, remoteAddress=/10.0.4.149:59638}
[2023-05-18T08:06:10,011][WARN ][o.e.h.n.Netty4HttpServerTransport] [elastic-1a] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/10.0.0.100:9200, remoteAddress=/10.0.4.149:59648}
[2023-05-18T08:06:11,332][WARN ][o.e.h.n.Netty4HttpServerTransport] [elastic-1a] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/10.0.0.100:9200, remoteAddress=/10.0.4.149:59650}
[2023-05-18T08:06:11,349][WARN ][o.e.h.n.Netty4HttpServerTransport] [elastic-1a] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/10.0.0.100:9200, remoteAddress=/10.0.4.149:59662}
[2023-05-18T08:06:11,349][WARN ][o.e.h.n.Netty4HttpServerTransport] [elastic-1a] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/10.0.0.100:9200, remoteAddress=/10.0.4.149:59678}
[2023-05-18T08:06:12,511][WARN ][o.e.h.n.Netty4HttpServerTransport] [elastic-1a] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/10.0.0.100:9200, remoteAddress=/10.0.4.149:59690}
[2023-05-18T08:06:15,009][WARN ][o.e.h.n.Netty4HttpServerTransport] [elastic-1a] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/10.0.0.100:9200, remoteAddress=/10.0.4.149:59696}
[2023-05-18T08:06:15,785][WARN ][o.e.h.n.Netty4HttpServerTransport] [elastic-1a] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/10.0.0.100:9200, remoteAddress=/10.0.4.149:59702}

A successful ping between the nodes does not show that Elasticsearch can connect. Can you telnet to port 9300 on the other nodes in the cluster from every node and get an appropriate error? (I am not expecting telnet to succeed due to security, but it should be able to reach the port.)

From the first node (EC2) I get:

[ec2-user@ip-10-0-0-100 ~]$ telnet 10.0.16.100 9300
Trying 10.0.16.100...
Connected to 10.0.16.100.
Escape character is '^]'.

From the second node (EC2) I get:

[ec2-user@ip-10-0-16-100 ~]$ telnet 10.0.0.100 9300
Trying 10.0.0.100...
Connected to 10.0.0.100.
Escape character is '^]'

That seems fine. What is in the Elasticsearch logs around your configured EC2 discovery when the node starts up?

As an example, from the logs when the instances boot:

[2023-05-18T08:55:33,532][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [elastic-1a] client did not trust this server's certificate, closing connection Netty4TcpChannel{localAddress=/10.0.0.100:9300, remoteAddress=/10.0.16.100:42124, profile=default}

Another log
[2023-05-18T08:55:33,775][WARN ][o.e.c.s.DiagnosticTrustManager] [elastic-1a] failed to establish trust with server at [<unknown host>]; the server provided a certificate with subject name [CN=ip-10-0-16-100.eu-central-1.compute.internal], fingerprint [13d8fa575d0f2551f0b54375b061aadc26084ced], no keyUsage and no extendedKeyUsage; the certificate is valid between [2023-05-18T08:53:41Z] and [2122-04-24T08:53:41Z] (current time is [2023-05-18T08:55:33.775404267Z], certificate dates are valid); the session uses cipher suite [TLS_AES_256_GCM_SHA384] and protocol [TLSv1.3]; the certificate does not have any subject alternative names; the certificate is issued by [CN=Elasticsearch security auto-configuration HTTP CA]; the certificate is signed by (subject [CN=Elasticsearch security auto-configuration HTTP CA] fingerprint [ba3e965eb9a259c3823bb5a9222f1e8055d56596]) which is self-issued; the [CN=Elasticsearch security auto-configuration HTTP CA] certificate is not trusted in this ssl context ([xpack.security.transport.ssl (with trust configuration: StoreTrustConfig{path=certs/transport.p12, password=<non-empty>, type=PKCS12, algorithm=PKIX})]); this ssl context does trust a certificate with subject [CN=Elasticsearch security auto-configuration HTTP CA] but the trusted certificate has fingerprint [4aa66b94e99bf53d5ef73f82d1c3c2e7d6cccc1f] sun.security.validator.ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors

Another log
[2023-05-18T08:55:33,778][WARN ][o.e.t.TcpTransport ] [elastic-1a] exception caught on transport layer [Netty4TcpChannel{localAddress=/10.0.0.100:42294, remoteAddress=/10.0.16.100:9300, profile=default}], closing connection io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: PKIX path validation failed: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors

  • certificate is not trusted in this ssl context
  • PKIX path validation failed: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
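The DiagnosticTrustManager entry above already names the mismatch: the peer's certificate is signed by a CA with fingerprint ba3e965e..., while the local truststore trusts a CA with the same subject but fingerprint 4aa66b94.... That is the pattern you would expect if each node generated its own CA independently. As a rough sketch for pulling those fingerprints out of the log on each node (trust_fingerprints is a hypothetical helper name, not an Elasticsearch tool, and the log path is the one given earlier in the thread):

```shell
# Sketch: list the certificate fingerprints named in DiagnosticTrustManager
# warnings, in first-seen order, so they can be compared across nodes.
# trust_fingerprints is an illustrative helper, not part of Elasticsearch.
trust_fingerprints() {
  # $1 = path to the Elasticsearch log file
  grep -F 'DiagnosticTrustManager' "$1" \
    | grep -oE 'fingerprint \[[0-9a-f]+\]' \
    | awk '!seen[$0]++'   # drop repeats, keep original order
}

# Example call (reading the log usually needs root):
#   trust_fingerprints /var/log/elasticsearch/dev.log
```

If the "signed by" fingerprint reported on one node never appears as a trusted fingerprint on the other, the two nodes do not share a CA for the transport layer.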

Check the path and permissions of the root certificate.

I guess you mean the following:

[ec2-user@ip-10-0-0-100 ~]$ sudo ls -la /etc/elasticsearch/certs
total 40
drwxr-x---. 2 root elasticsearch    62 May 18 08:54 .
drwxr-s---. 5 root elasticsearch 16384 May 18 08:54 ..
-rw-rw----. 1 root elasticsearch 10109 May 18 08:54 http.p12
-rw-rw----. 1 root elasticsearch  1915 May 18 08:54 http_ca.crt
-rw-rw----. 1 root elasticsearch  5854 May 18 08:54 transport.p12

Yes, that, especially http_ca.crt

I really have no idea what to change :smiley: everything is the Elastic default.

Did you try to google for setting up Elastic with certificates? There are some good tutorials out there!

It also helps to understand how the components work together.

I solved it by generating new p12 certificates. Thanks.

Good to know, thanks!
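The thread does not say which commands were used to generate the new certificates. One common approach, shown here only as a sketch, is to create a single CA with elasticsearch-certutil and issue each node's transport certificate from it. The install path assumes an RPM/DEB package, and the function names and output file names are placeholders introduced for illustration.

```shell
# Sketch only: issue transport certificates for both nodes from ONE shared CA.
# The default path assumes a package install; override CERTUTIL if yours differs.
CERTUTIL="${CERTUTIL:-/usr/share/elasticsearch/bin/elasticsearch-certutil}"

# Create the shared CA once, on one machine (prompts for a CA password).
make_ca() {
  # $1 = output file for the CA, e.g. elastic-stack-ca.p12
  "$CERTUTIL" ca --out "$1"
}

# Issue one node certificate signed by that CA (prompts for passwords).
make_node_cert() {
  # $1 = CA file, $2 = node name, $3 = node IP, $4 = output .p12 file
  "$CERTUTIL" cert --ca "$1" --name "$2" --ip "$3" --out "$4"
}

# Example run with the node names and IPs from this thread:
#   make_ca /tmp/elastic-stack-ca.p12
#   make_node_cert /tmp/elastic-stack-ca.p12 elastic-1a 10.0.0.100  /tmp/elastic-1a.p12
#   make_node_cert /tmp/elastic-stack-ca.p12 elastic-1b 10.0.16.100 /tmp/elastic-1b.p12
```

Each generated .p12 would then replace certs/transport.p12 on its node, with the same ownership and permissions as the listing above, and the new keystore and truststore passwords stored via elasticsearch-keystore before restarting both nodes. Because both certificates chain to the same CA, the "Path does not chain with any of the trust anchors" error should no longer apply.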
