How can I figure out which file Logstash is currently processing, and which position in that file it has reached?
I have configured 5 files for input, and usually it's working fine. But nobody has written to those files for hours, and I still see some Logstash workers using a considerable amount of CPU. So I'd like to figure out WHAT Logstash is actually doing at the moment.
The scripts in the StackOverflow post won't work for me.
I have some sincedb files that contain the same inode, but with different offset values:
.sincedb_26675a6da459b5bb8f0ec1de6ad0b97f:538586 0 2049 26078748
.sincedb_e4d8738c2308107cbfedd0dfa66b800b:538586 0 2049 100350
Is this OK?
By the way: the script from StackOverflow returns this error:
join: /var/lib/logstash/plugins/inputs/file/.sincedb_26675a6da459b5bb8f0ec1de6ad0b97f:4: is not sorted: 543354 0 2049 3033
By default the sincedb files are named based on a hash of the filename pattern, so you've probably used different filename patterns that look at the same file. You should probably use the sincedb_path option to explicitly select the path to the sincedb file so you know which entry to use.
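To check whether several sincedb files really track the same file, something like the following rough sketch might help. It assumes each sincedb line has the four whitespace-separated fields seen in the entries above (inode, major device number, minor device number, byte offset). The function names `parse_sincedb` and `find_shared_inodes` are made up for this example and are not part of Logstash.

```python
from collections import defaultdict


def parse_sincedb(text):
    """Parse the contents of one sincedb file.

    Assumes each line is: inode major minor offset.
    Returns {(inode, major, minor): offset}.
    """
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        inode, major, minor, offset = (int(f) for f in fields[:4])
        entries[(inode, major, minor)] = offset
    return entries


def find_shared_inodes(sincedb_texts):
    """Find files that are tracked by more than one sincedb file.

    sincedb_texts maps a sincedb file name to its contents.
    Returns {(inode, major, minor): {sincedb_name: offset}} for every
    key that shows up in at least two sincedb files.
    """
    seen = defaultdict(dict)
    for name, text in sincedb_texts.items():
        for key, offset in parse_sincedb(text).items():
            seen[key][name] = offset
    return {key: names for key, names in seen.items() if len(names) > 1}


# The two entries quoted in the question.
example = {
    ".sincedb_26675a6da459b5bb8f0ec1de6ad0b97f": "538586 0 2049 26078748\n",
    ".sincedb_e4d8738c2308107cbfedd0dfa66b800b": "538586 0 2049 100350\n",
}
shared = find_shared_inodes(example)
```

For the two entries from the question, `shared` has a single key for inode 538586 with both offsets, which fits the explanation that two different filename patterns matched the same file.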
I have extended that StackOverflow script quite a bit: it now uses 'lsof' to obtain the position that Logstash has currently read up to.
My script can be found here: Logstash Progress - Pastebin.com
You'll have to modify the FILES_TO_BE_PARSED and SINCE_DB_FILES variables to point to the locations and naming format of your own files.
An example output of my script would look something like: