It looks like there is currently no way to live-reload the configuration (please correct me if I'm wrong), and the only option is to shut down Logstash and start it again.
In that case, a question: how many log messages will I lose if I have a pipeline that reads from a file and sends to Elasticsearch? My understanding is:
1. The file pointer is saved, so no log messages are duplicated or lost.
1.a If the log file was rotated while Logstash was down, the messages remaining in the rotated file are lost.
1.b In the case of rotation, will there be a conflict with the saved pointer to the file position?
2. A small number of messages are probably lost from Logstash's internal queue.
3. Messages that are queued for shipping to Elasticsearch, or in the middle of being shipped, are also lost.
Am I correct?
What are the options for preventing log loss and for adding new files to the configuration at runtime?
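For context, here is a minimal sketch of the kind of pipeline I mean (the paths, host, and index name are made up):

```
input {
  file {
    # Tail application logs; Logstash remembers its position per file.
    path => "/var/log/myapp/*.log"
  }
}

output {
  elasticsearch {
    # Ship events to a local Elasticsearch node, one daily index.
    hosts => ["localhost:9200"]
    index => "myapp-%{+YYYY.MM.dd}"
  }
}
```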
With the revamped pipeline flushing in Logstash 2.0 (I think), I don't think the two internal 20-item queues will be lost. The only thing I'd worry about is file rotation. Sadly, the file input documentation isn't very specific about what happens there.
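As for adding new files at runtime: the file input's path option takes glob patterns and periodically rediscovers matching files, so new log files can be picked up without a config change. You can also pin the sincedb so the saved position survives restarts. A sketch, with made-up paths:

```
input {
  file {
    # Glob pattern; new matching files are discovered automatically
    # (discover_interval controls how often, default 15 seconds).
    path => "/var/log/myapp/*.log"
    # Pin the sincedb instead of relying on the default under $HOME,
    # so the saved read positions reliably survive restarts.
    sincedb_path => "/var/lib/logstash/sincedb-myapp"
    # How often (in seconds) the positions are written to disk.
    sincedb_write_interval => 15
  }
}
```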
Is there documentation of that feature for the next release?
There's a roadmap page in the online documentation.
A question on the side: if the internal queues are 20 items long, how does bulk insertion into Elasticsearch work? Is it a 40-item bulk?
No, the elasticsearch output has its own buffer. But yes, those messages are also at risk if Logstash is shut down abruptly without being given an opportunity to shut down cleanly.
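To illustrate with the 1.5/2.x-era output options (defaults from memory, so double-check against your version's docs):

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # A bulk request is sent once flush_size events have accumulated
    # or idle_flush_time seconds have passed, whichever comes first.
    # Events still sitting in this buffer are lost on a hard kill.
    flush_size => 500
    idle_flush_time => 1
  }
}
```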
Yes, at least if Logstash is able to flush the pipeline. If an output is blocked then Logstash won't shut down, which is something you might need to deal with.