Failed to tls handshake i/o timeout


(Tim Dunphy) #1

Hey all,

Well, I was able to get logstash 1.5.1 working with elasticsearch 1.6.0 and kibana 4 sitting behind nginx! It's a pretty sweet setup, except that I can't seem to get logstash-forwarder to connect to logstash. :frowning:

I'm on logstash-forwarder-0.4.0, installed via RPM on CentOS 7.

When I start the logstash forwarder I get this message repeating over and over again in the logs, and nothing makes its way through to the logstash server:

2015/06/27 02:00:50.479754 Connecting to [10.10.10.25]:2541 (es1.mydomain.com)
2015/06/27 02:01:05.482540 Failed to tls handshake with 216.120.248.98 read tcp 10.10.10.25:2541: i/o timeout # <-- Not the real IP. Obscuring it.

I generated SSL certs and keys using the following procedure:

Create CA key
1) openssl genrsa -des3 -out ca.key 4096

Create CA cert
2) openssl req -new -x509 -days 3650 -key ca.key -out ca.crt

Create es1.mydomain.com key and certificate signing request
3) openssl genrsa -des3 -out es1.mydomain.com.key 4096
4) openssl req -new -key es1.mydomain.com.key -out es1.mydomain.com.csr

Sign the es1.mydomain.com certificate
5) openssl x509 -req -days 3650 -in es1.mydomain.com.csr -CA ca.crt -CAkey ca.key -set_serial 01 -out es1.mydomain.com.crt

Remove the password from the es1.mydomain.com private key
6) openssl rsa -in es1.mydomain.com.key -out es1.mydomain.com.key

I placed the cert/key in this location:

-rw-------. 1 logstash logstash 2004 Jun 27 00:10 /etc/pki/tls/certs/es1.mydomain.com.crt
-rw-------. 1 logstash logstash 3243 Jun 27 00:11 /etc/pki/tls/private/es1.mydomain.com.key
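
A sanity check that fits at this point is confirming that the certificate verifies against the CA and that the key on disk belongs to the certificate. A minimal sketch, assuming bash and the openssl CLI; `chain_ok` and `pair_ok` are hypothetical helper names, and the key must already have its passphrase removed (step 6), otherwise openssl will prompt for it:

```bash
#!/bin/bash
# chain_ok CA_CERT SERVER_CERT (hypothetical helper):
# succeeds if SERVER_CERT verifies against CA_CERT.
# The output is checked for ": OK" rather than relying on the exit status.
chain_ok() {
    openssl verify -CAfile "$1" "$2" 2>/dev/null | grep -q ': OK$'
}

# pair_ok CERT KEY (hypothetical helper):
# succeeds if the certificate and the private key carry the same public key.
pair_ok() {
    local cert_pub key_pub
    cert_pub=$(openssl x509 -in "$1" -noout -pubkey | openssl sha256)
    key_pub=$(openssl pkey -in "$2" -pubout | openssl sha256)
    [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}
```

With the files from the steps above this would be called as `chain_ok ca.crt es1.mydomain.com.crt && pair_ok es1.mydomain.com.crt es1.mydomain.com.key`. A failure in either check would point at the certificate setup rather than at the network.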

For my input section I have this:

input {
  lumberjack {
    # The port to listen on
    port => 2541

    # The paths to your ssl cert and key
    ssl_certificate => "/etc/pki/tls/certs/es1.mydomain.com.crt"
    ssl_key => "/etc/pki/tls/private/es1.mydomain.com.key"

    # Set this to whatever you want.
    type => "logstash"
    codec => "json"
  }
}

And when I start logstash I can verify that it's listening on the specified port:

 #lsof -i :2541 | head -5
COMMAND    PID     USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
logstash- 5106     root    8u  IPv4 24517416      0t0  TCP logs.mydomain.com:33224->logs.mydomain.com:lonworks2 (ESTABLISHED)
java      6398 logstash   16u  IPv6 13842083      0t0  TCP *:lonworks2 (LISTEN)
java      6398 logstash 3012u  IPv6 13913271      0t0  TCP logs.mydomain.com:lonworks2->logs.mydomain.com:45634 (CLOSE_WAIT)
java      6398 logstash 3464u  IPv6 13922088      0t0  TCP logs.mydomain.com:lonworks2->logs.mydomain.com:47063 (CLOSE_WAIT)

In my logstash-forwarder conf I have this:

{
  "network": {
    "servers": [ "logs.mydomain.com:2541" ],
    "ssl ca": "/etc/pki/CA/certs/ca.crt",
    "timeout": 15
  },

Not sure at all where the error lies. But some troubleshooting tips and advice would be greatly appreciated!

Thanks


(Mark Walkom) #2

Did you add logs.mydomain.com as an alternate domain when creating the certificate? It doesn't look like it.
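
For reference, an "alternate domain" here means a subjectAltName entry, which lets one certificate cover both es1.mydomain.com and logs.mydomain.com. A minimal sketch of a self-signed certificate carrying both names, assuming bash and the openssl CLI; the function names and the output file names (`san.cnf`, `lsf.key`, `lsf.crt`) are hypothetical, and a CA-signed certificate would need the extension supplied at signing time instead:

```bash
#!/bin/bash
# make_san_cert OUTDIR (hypothetical helper): writes OUTDIR/lsf.key and
# OUTDIR/lsf.crt, a self-signed pair valid for both host names.
make_san_cert() {
    local out=$1
    cat > "$out/san.cnf" <<'EOF'
[req]
distinguished_name = dn
x509_extensions    = v3_ext
prompt             = no

[dn]
CN = es1.mydomain.com

[v3_ext]
subjectAltName = DNS:es1.mydomain.com, DNS:logs.mydomain.com
EOF
    openssl req -x509 -nodes -newkey rsa:2048 -days 3650 \
        -config "$out/san.cnf" \
        -keyout "$out/lsf.key" -out "$out/lsf.crt"
}

# show_names CERT (hypothetical helper): prints the subject and any
# subjectAltName entries so the covered names can be read off directly.
show_names() {
    openssl x509 -in "$1" -noout -subject
    openssl x509 -in "$1" -noout -text | grep -A1 'Subject Alternative Name'
}
```

Running `show_names` against the certificate currently in use shows whether the name in the forwarder's "servers" entry is covered at all.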


(Tim Dunphy) #3

Did you add logs.mydomain.com as an alternate domain when creating the certificate? It doesn't look like it.

I have two domains for this host.

I was planning on keeping es1 as the main name of the host that I was going to base the certs on in the 'common name' section of the cert. And only use the logs DNS address to host the site through nginx.

I'm not sure what you're referring to as the 'alternate domain' for the cert. The only thing I know of in a cert that refers to a machine name is the common name.

But do you think that the name I'm using in the cert matters? If so I have no problem only using the name logs.


(Mark Walkom) #4

Try with the logs name and see if that resolves it.


(Tim Dunphy) #5

I set the hostname back to logs. No difference. :frowning:

2015/06/28 00:52:17.160067 Connecting to [10.10.10.25]:2541 (logs.mydomain.com)
2015/06/28 00:52:33.163091 Failed to tls handshake with 10.10.10.25 read tcp 10.10.10.25:2541: i/o timeout
2015/06/28 00:52:34.164918 Connecting to [10.10.10.25]:2541 (logs.mydomain.com)

This is my current lumberjack input:

lumberjack {
  # The port to listen on
  port => 2541

  # The paths to your ssl cert and key
  ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
  ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"

  # Set this to whatever you want.
  type => "logstash"
  codec => "json"
}

This is my logstash-forwarder config:

{
  # The network section covers network configuration :)
  "network": {
    # A list of downstream servers listening for our messages.
    # logstash-forwarder will pick one at random and only switch if
    # the selected one appears to be dead or unresponsive
    "servers": [ "logs.jokefire.com:2541" ],

    # The path to your client ssl certificate (optional)
    #"ssl certificate": "./logstash-forwarder.crt",
    # The path to your client ssl key (optional)
    #"ssl key": "./logstash-forwarder.key",

    # The path to your trusted ssl CA file. This is used
    # to authenticate your downstream server.
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt",

    # Network timeout in seconds. This is most important for
    # logstash-forwarder determining whether to stop waiting for an
    # acknowledgement from the downstream server. If a timeout is reached,
    # logstash-forwarder will assume the connection or server is bad and
    # will connect to a server chosen at random from the servers list.
    "timeout": 15
  },

I used this guide to create my setup so far:

[Setup ELK on CentOS 7][1]
[1]: https://www.digitalocean.com/community/tutorials/how-to-install-elasticsearch-logstash-and-kibana-4-on-centos-7

And I regenerated my key using the instructions in there which said to do basically this:

cd /etc/pki/tls
sudo openssl req -subj '/CN=logstash_server_fqdn/' -x509 -days 3650 -batch -nodes -newkey rsa:2048 -keyout private/logstash-forwarder.key -out certs/logstash-forwarder.crt

Got any other ideas?

Thanks


(Mark Walkom) #6

IO timeout may suggest networking problems, can you telnet from the LSF host to the LS host?
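
Note that telnet only exercises the TCP connect, so it can succeed while the TLS handshake still stalls. A rough sketch for telling the two stages apart, assuming bash, coreutils `timeout`, and the openssl CLI; `probe` is a hypothetical helper name, and the 5 and 15 second limits are arbitrary choices (15 matches the forwarder's configured timeout):

```bash
#!/bin/bash
# probe HOST PORT CAFILE (hypothetical helper): report which stage fails.
probe() {
    local host=$1 port=$2 cafile=$3

    # Stage 1: plain TCP connect, the same thing telnet checks.
    if ! timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
        echo "tcp connect failed"
        return 1
    fi

    # Stage 2: TLS handshake with CAFILE as the trust store. Empty stdin
    # makes s_client exit once the handshake is done. By default s_client
    # does not fail on certificate verification errors, so "completed"
    # means the handshake finished, not that the certificate was trusted.
    if timeout 15 openssl s_client -connect "$host:$port" -CAfile "$cafile" \
            </dev/null >/dev/null 2>&1; then
        echo "tls handshake completed"
    else
        echo "tls handshake failed or timed out"
        return 2
    fi
}
```

From the LSF host this would be run as `probe logs.mydomain.com 2541 /etc/pki/CA/certs/ca.crt`. If stage 1 passes and stage 2 does not, the network path is fine and the listener itself is not answering the handshake.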


(Tim Dunphy) #7

Hi Warkolm,

Well, I'm testing from two different places. I tried making the LS host itself the first client. So network problems wouldn't really come into play there.

And from the other host that I'm trying to make an LSF client, I can telnet to the port that LS is listening on:

[root@ops:~] #telnet logs.mydomain.com 2541
Trying 10.10.10.25...
Connected to logs.mydomain.com.
Escape character is '^]'.

And of course I'm able to telnet into that port from the LS host itself where I am also running LSF.

And yet, the issue persists:

2015/06/28 16:10:03.467502 Connecting to logs.mydomain.com:2541
2015/06/28 16:10:18.484486 Failed to tls handshake with logs.mydomain.com:2541 read tcp 10.10.10.25:2541: i/o timeout

(Tim Dunphy) #9

Hey guys, I actually got this working by adding the IP of the elasticsearch host in the output section. And I had some results turning up in the Kibana interface from among the 22 servers I was running the logstash-forwarder on. That was cool!!

I'm still getting the i/o timeout issue in logstash forwarder. But I was getting results in kibana even while those errors were occurring.

However after a few hours results stopped turning up in the Kibana interface again. I'll open up another question on that problem.

Thanks for your help!


(gabriel) #10

Check whether the /tmp partition is mounted with a restrictive option such as noexec. If so, remove the option and remount the partition.
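
A quick way to check this, sketched under the assumption of a Linux host with util-linux `findmnt`; `has_noexec` and `tmp_mount_opts` are hypothetical helper names:

```bash
#!/bin/bash
# has_noexec OPTS (hypothetical helper): succeeds if the comma-separated
# mount option string contains the noexec option.
has_noexec() {
    case ",$1," in
        *,noexec,*) return 0 ;;
        *)          return 1 ;;
    esac
}

# tmp_mount_opts (hypothetical helper): prints the mount options of the
# filesystem that holds /tmp.
tmp_mount_opts() {
    findmnt -no OPTIONS --target /tmp
}
```

For example, `has_noexec "$(tmp_mount_opts)" && echo "/tmp is mounted noexec"`. If it is, `mount -o remount,exec /tmp` (as root) lifts the option until the next reboot; a permanent change would go in /etc/fstab.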

Regards ...

