TLS errors, max mem, max clients... sudden death everywhere!

alexolivan · June 3, 2015, 2:01pm

Hi again forum!

...I'm sad as my brand new toy lasted very little time... thank God I kept the old stuff on.
As I decided that ELK may still be suitable for giving IT staff nice stats on services, I started creating my filters and got my elasticsearch mini cluster up, and so on.... but as I added a few logstash-forwarder clients with several services, problems arose very quickly...

First, logstash was stopping with out-of-memory errors.... increasing the max mem limit is only a stopgap, so I switched my debian repo from 1.4 to 1.5 and upgraded, since I read that there is a memory leak on TCP connections (I use them) that is fixed in 1.5.
Second, the upgrade was not clean... since I had added some patterns in the patterns dir, there were problems deleting old folders (fortunately!!!! that saved my filters!!!!) and the upgrade was a little bit bitter.
Third, as I recovered from the memory problems (I increased the mem limit anyway), there seem to be connection problems from the LSF clients... the max_connection limit appears in the log... and I'm starting to become worried...
Trying to find where the connection limit is defined, I ended up discovering it is defined nowhere...
Out of desperation, I read that increasing threads may solve the problem... so I added -w X to the init.d script, since I'm unable to find the thread parameter anywhere in the /etc/default/ file... and tried again.
The problem persists.... now it's TLS handshake errors everywhere...
In addition, since I started using -w 8, start/stop is horrible...

So.... as you can read, it is all out of service... very very disappointing and, of course, far from being considered for production.... but I read about people that have been using it on thousands of servers!!! why is it now not possible to run more than just 10 or 12 servers? I've got plenty of CPU and RAM everywhere.... and the datacenter eth connection is gigabit everywhere.... in all, no system is overloaded...

So a massive full failure is really strange.... do you believe I'm missing something? it all seems like a nightmare!

Best regards

alexolivan · June 3, 2015, 2:09pm

Is it possible to have several lumberjack inputs listening on different ports???

It may be a way to deal with this mess...

magnusbaeck · June 3, 2015, 5:26pm

Yes, that should work just fine.
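
A minimal sketch of what two lumberjack inputs could look like, since examples of this seem hard to find. The port numbers, certificate paths and type values below are assumptions for illustration, not taken from the setup in this thread; the lumberjack input itself does require port, ssl_certificate and ssl_key.

```
input {
  # First listener, e.g. for one group of logstash-forwarder clients
  lumberjack {
    port => 5043                                        # assumed port
    ssl_certificate => "/etc/logstash/ssl/lumberjack.crt"  # hypothetical path
    ssl_key => "/etc/logstash/ssl/lumberjack.key"          # hypothetical path
    type => "lsf-group-a"                               # hypothetical label
  }

  # Second listener on a different port, same certificate and key
  lumberjack {
    port => 5044                                        # assumed port
    ssl_certificate => "/etc/logstash/ssl/lumberjack.crt"
    ssl_key => "/etc/logstash/ssl/lumberjack.key"
    type => "lsf-group-b"
  }
}
```

If only one port shows up in netstat -tulnp after a restart, it may be worth checking that the edited file is in the directory the service actually loads (with the Debian package this is normally /etc/logstash/conf.d) and looking in the logstash log for a bind or certificate error on the second input.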

alexolivan · June 5, 2015, 1:12am

aha... I tried it but it didn't work... i.e. the syntax I used was not flagged as wrong and the service started, but by running netstat -tulnp I realized nothing was listening on the port.
This is something I didn't find anywhere on google... not a single example, post or howto... very strange.
maybe you could post an example of how to do it... I just copy-pasted my single lumberjack input into a new one and changed the port number... no complaints, but no joy.

So... I returned to my Debianist philosophy: I downgraded bleeding edge 1.5 to the previous 1.4 release (which I consider stable enough).
I'm installing a redis server.
I will drop the "sid like/testing" logstash-forwarder marvels and replace them with logstash 1.4 on every reporting server.
So, I will distribute log processing at the origin end, sending to a common redis broker server, with a single logstash server at the receiving end that ingests the pre-formatted/grokked data from redis just to inject it into elasticsearch.
If possible, I would stay away from TLS.
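
The shipper / redis broker / indexer layout described above could be sketched roughly like this for logstash 1.4. Host names, the log path, the grok pattern choice and the redis key are all assumptions for illustration; only the general shape (redis output on each server, redis input on the central one) comes from the plan above.

On each reporting server (shipper), parsing is done locally and events are pushed to a redis list:

```
input {
  file {
    path => "/var/log/syslog"      # hypothetical log file
    type => "syslog"
  }
}

filter {
  if [type] == "syslog" {
    grok {
      # SYSLOGLINE ships with logstash's default patterns
      match => { "message" => "%{SYSLOGLINE}" }
    }
  }
}

output {
  redis {
    host => "redis.example.internal"   # hypothetical broker host
    data_type => "list"
    key => "logstash"                  # assumed list name
  }
}
```

On the central logstash server (indexer), already-parsed events are popped from the same list and sent on to elasticsearch:

```
input {
  redis {
    host => "redis.example.internal"   # same hypothetical broker
    data_type => "list"
    key => "logstash"                  # must match the shipper's key
  }
}

output {
  elasticsearch {
    host => "es.example.internal"      # hypothetical ES node
    protocol => "http"
  }
}
```

Nothing here uses TLS, so this only makes sense on a network that is trusted end to end.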

This is something I have seen in several old howtos around... I bet it will be more robust and scalable.
I'm also scaling up ES by implementing multiple master nodes and multiple data client nodes (I'm still thinking about how to balance them... I'm thinking of nginx)... so the creature is exploding in complexity... I hope it all pays off the effort

Best regards!
