ELK with SSL, possible issues?

Hello all!

I am fairly new to the ELK stack and am running a Proof of Concept. Getting it up and running over HTTP was fairly easy, following the guides. However, as I want SSL enabled on all communications, I am now struggling quite a bit :slight_smile:

So long story short, here is my problem.
The host is a VM running Ubuntu 18.04 LTS. The Elasticsearch log is getting spammed with the following warning (I know it is just a warning, but I don't like having it there).

[2020-01-23T08:22:34,260][WARN ][o.e.h.n.Netty4HttpServerTransport] [primo] caught exception while handling client http traffic, closing connection [id: 0xbab5aee4, L:0.0.0.0/0.0.0.0:9200 ! R:/127.0.0.1:38978]
io.netty.handler.codec.DecoderException: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 474554202f5f636c75737465722f6865616c74682f5f616c6c3f6c6f63616c3d747275652674696d656f75743d36307320485454502f312e310d0a436f6e74656e742d4c656e6774683a20300d0a486f73743a203132372e302e302e313a393230300d0a436f6e6e656374696f6e3a204b6565702d416c6976650d0a557365722d4167656e743a204170616368652d48747470436c69656e742f342e352e3420284a6176612f312e382e305f323332290d0a4163636570742d456e636f64696e673a20677a69702c6465666c6174650d0a0d0a
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:472) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) [netty-common-4.1.32.Final.jar:4.1.32.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
Caused by: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 474554202f5f636c75737465722f6865616c74682f5f616c6c3f6c6f63616c3d747275652674696d656f75743d36307320485454502f312e310d0a436f6e74656e742d4c656e6774683a20300d0a486f73743a203132372e302e302e313a393230300d0a436f6e6e656374696f6e3a204b6565702d416c6976650d0a557365722d4167656e743a204170616368652d48747470436c69656e742f342e352e3420284a6176612f312e382e305f323332290d0a4163636570742d456e636f64696e673a20677a69702c6465666c6174650d0a0d0a
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1182) ~[netty-handler-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1247) ~[netty-handler-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]
        ... 15 more

The configuration in elasticsearch.yml is as follows:

cluster.name: test
node.name: primo
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: "localhost"
http.port: 9200
discovery.zen.ping.unicast.hosts: ["localhost"]
discovery.zen.minimum_master_nodes: 1
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: false
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.key: /etc/elasticsearch/config/certs/key.key
xpack.security.transport.ssl.certificate: /etc/elasticsearch/config/certs/cert.pem
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.key:  /etc/elasticsearch/config/certs/key.key
xpack.security.http.ssl.certificate: /etc/elasticsearch/config/certs/cert.pem

The certificates are self-signed, so I haven't specified a CA. I am not sure whether this is an issue.
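
For reference, and assuming the self-signed certificate can act as its own CA, I think the extra lines would look something like this (reusing my existing paths; I haven't added them yet):

xpack.security.http.ssl.certificate_authorities: [ "/etc/elasticsearch/config/certs/cert.pem" ]
xpack.security.transport.ssl.certificate_authorities: [ "/etc/elasticsearch/config/certs/cert.pem" ]
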
It is up and running, but as it is a single-node cluster the status is showing as yellow:

curl -XGET -k -u '<user>:<password>' 'https://localhost:9200/_cluster/health?pretty'
{
  "cluster_name" : "test",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 75,
  "active_shards" : 75,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 55,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 57.692307692307686
}
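
As an aside, my understanding is that the yellow status on a single node just means the replica shards have nowhere to be allocated. Something like the following should turn it green by dropping replicas on the existing indices; I haven't run it yet and _all is deliberately broad, so treat it as a sketch:

curl -XPUT -k -u '<user>:<password>' -H 'Content-Type: application/json' 'https://localhost:9200/_all/_settings?pretty' -d '{ "index.number_of_replicas": 0 }'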

curl -XGET -k -u '<user>:<password>' 'https://localhost:9200/?pretty'
{
  "name" : "primo",
  "cluster_name" : "test",
  "cluster_uuid" : "9I9BmTr1SVaKV9GoSYSGwA",
  "version" : {
    "number" : "6.8.6",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "3d9f765",
    "build_date" : "2019-12-13T17:11:52.013738Z",
    "build_snapshot" : false,
    "lucene_version" : "7.7.2",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

Based on my research over the last few days, this problem is usually caused by another tool or system trying to connect to Elasticsearch via HTTP instead of HTTPS, which is why I stopped both Kibana and Logstash completely. At the moment the only things running on the machine are Elasticsearch and nginx (pointing to the Kibana GUI). The warnings are still showing up constantly.
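
If I understand the cause correctly, any plain-HTTP request against the TLS-enabled port should reproduce exactly this warning in the log, for example:

curl http://localhost:9200/

while the HTTPS equivalent (with -k because of the self-signed certificate) goes through cleanly:

curl -XGET -k -u '<user>:<password>' 'https://localhost:9200/'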

Any pointers on why these warnings keep appearing and where to look further would be a great help, as it is clearly this machine trying to talk to itself.

Also, I was not able to find a setting that would disable clustering in Elasticsearch, and at the moment this is my primary suspect.
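
The closest thing I came across is single-node discovery, which I believe is available in 6.8, though I am not sure it has any bearing on these warnings:

discovery.type: single-node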

That is correct: it is something on this machine sending plain-HTTP requests to Elasticsearch. The only solution is for you to work out what is sending these requests and stop it. Here is the content of the request, if it helps:

GET /_cluster/health/_all?local=true&timeout=60s HTTP/1.1
Content-Length: 0
Host: 127.0.0.1:9200
Connection: Keep-Alive
User-Agent: Apache-HttpClient/4.5.4 (Java/1.8.0_232)
Accept-Encoding: gzip,deflate

So it's something using Apache-HttpClient/4.5.4 (Java/1.8.0_232) that is asking for the cluster health. The log message also tells you the address and port that the client used: R:/127.0.0.1:38978. That means it's on the same machine. Maybe you can catch the culprit by using sudo netstat -antp to find the process ID that is using the port. Note that the port may well change between requests, so you might have to keep trying until you find a match.
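
If the connections are too short-lived to catch by hand, a quick loop along these lines might help (just a sketch; ss -tnp would also work if you prefer it over netstat):

while true; do
  sudo netstat -antp | grep ':9200' | grep ESTABLISHED
  sleep 1
done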

I don't think this is Elasticsearch, by the way. Elasticsearch doesn't talk to itself much over HTTP, and this cluster health API call is kinda strange (why say _all, and why use local=true?). Also, I think we jumped from HttpClient 4.5.2 in 6.8 to 4.5.7 in 7.0, so there isn't a version of Elasticsearch that uses 4.5.4:

$ git diff 6.8..v7.0.0 -- buildSrc/version.properties | grep httpclient
-httpclient        = 4.5.2
+httpclient        = 4.5.7

Hello David,

Thank you very much for the prompt reply.

I was dead stuck on this. Mentioning Apache-HttpClient was a great pointer, and I found the culprit: as this is a PoC machine, there was an instance of another service using Elasticsearch. I disabled that one and all looks good at the moment. Now I can go back to configuring the .NET Core app to send the messages properly.

Again, thanks a great deal.

Cheers!
