[SOLVED] After several months, logstash file input stops working

(Alejandro Olivan) #1

Hi forum...

logstash is driving me nuts...
My logstash servers are restarted every two hours in order to keep the cluster up... but even so, some servers can refuse to read some files for days... and then they read them again (while others stop)... it is crazy...

There is one server that refuses to read almost any file... it seems to read a few at random before it stops processing altogether... until the next restart.

There is absolutely no error or log entry that I could refer to about this.
I have tried launching logstash from the CLI in verbose debug mode... but it shows nothing.
Sometimes, after a restart, while tailing stdout you can see some logs flowing... then it goes silent.

It seems to occur on the logstash shipper processes (the logstash processes running on the service servers that feed the redis servers).

The most curious thing is that I refused to run 1.5 because it was terribly unstable, so I downgraded to 1.4.3 and the whole thing ran for almost 2 months on its own... now all servers have been periodically updated to 1.4.5 and the whole thing smokes everywhere...

I hope someone can give me a clue...

best regards

(Alejandro Olivan) #2

Got it working again...

Several changes were applied here, one after another:

1st... upgraded from 1.4.5 to 1.5.4... this did not solve the problem
2nd... split the wildcard file input stanzas into individual input stanzas (one per current file) in order to force more threads. This is a big problem for configuration maintenance... and it made no difference
3rd... changed the group in the init.d script to adm... this was the former 1.4.5 setup, and no go...
4th... ensured all parent directories of the final log files have the execute bit set for everyone... BINGO!

So, although I abandoned further testing as soon as it worked, maybe this could help someone... It seems read permissions and ownership on the files are not enough: without the execute bit on a directory, a user cannot traverse it to reach the files inside. I have recursively set all folders to AT LEAST 755, rather than just 744/644, for every file that had to be accessed...
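
In case it helps anyone checking the same thing, here is a rough Python sketch (not something I ran on the cluster) that walks up from a log file and lists every parent directory missing the execute bit for "others". The function name `dirs_missing_search_bit` is made up for illustration, and it assumes the logstash user is neither the owner nor in the group of those directories, so that only the "others" bits matter. Run it as a user who can already traverse the path (e.g. root), otherwise the `os.stat` call itself may fail.

```python
import os
import stat
import sys


def dirs_missing_search_bit(path):
    """Return the parent directories of `path` that lack o+x (search) permission.

    Hypothetical helper: it only inspects the 'others' execute bit, which is
    the bit that matters when the reading user is neither owner nor in the group.
    """
    missing = []
    current = os.path.dirname(os.path.abspath(path))
    while True:
        mode = os.stat(current).st_mode
        if not mode & stat.S_IXOTH:
            missing.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return missing


if __name__ == "__main__":
    # Usage: python check_dirs.py /path/to/some.log [more files...]
    for log_file in sys.argv[1:]:
        for directory in dirs_missing_search_bit(log_file):
            print(f"{log_file}: no o+x on {directory}")
```

Any directory it prints would need something like `chmod o+x` (or 755) before a non-owner process can reach the files below it.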

ufffff! :sweat_smile:

best regards
