I'm using Ubuntu 16.04.3 LTS and want Logstash to read a static file and stop after that. I read you can monitor the sincedb file to know when Logstash has reached EOF (here).
I wrote a Python script to try to achieve this. It starts Logstash using Python's subprocess module:
After that I use stat -c '%s' /path/to/data/file to get the total size of the data file in bytes (cast to int for the comparison, using filter, etc.) and set the current position to 0. Then I run this loop:
while (current != filesize):
    try:
        # get the current position from the sincedb file
        current = int(filter(str.isdigit, subprocess.check_output("tail -1 " + sincedb + " | awk '{printf $4}'", shell=True)))
    except (OSError, ValueError):  # in case the file doesn't exist yet
        pass
print("Done. It should stop now")  # don't know what's the best way to stop it
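The same check can also be done without shelling out to tail and awk. Below is a minimal sketch, not a definitive implementation: the function names (read_sincedb_position, reached_eof, wait_for_eof) are made up for illustration, and it assumes the byte offset is the fourth whitespace-separated field of the last sincedb line, as in the awk call above. It sleeps between checks instead of spinning, and gives up after a timeout.

```python
import os
import time


def read_sincedb_position(sincedb_path):
    """Return the offset from the last sincedb line, or None if unavailable."""
    try:
        with open(sincedb_path) as handle:
            lines = [line for line in handle if line.strip()]
    except (IOError, OSError):  # sincedb file not created yet
        return None
    if not lines:
        return None
    fields = lines[-1].split()
    # Assumption: the offset is the fourth field, matching awk '{printf $4}'.
    if len(fields) < 4 or not fields[3].isdigit():
        return None
    return int(fields[3])


def reached_eof(sincedb_path, data_path):
    """True once the recorded offset is at least the data file's size."""
    position = read_sincedb_position(sincedb_path)
    return position is not None and position >= os.path.getsize(data_path)


def wait_for_eof(sincedb_path, data_path, poll_seconds=1.0, timeout_seconds=600.0):
    """Poll until EOF is recorded; return False if the timeout passes first."""
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        if reached_eof(sincedb_path, data_path):
            return True
        time.sleep(poll_seconds)
    return False
```

os.path.getsize replaces the stat -c '%s' call, and comparing with >= rather than != keeps the loop from running forever if the recorded offset ever differs from the expected size.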
I leave some time between launching Logstash and starting the loop, which let me see that Logstash is not starting. The error is: "ERROR: Pipelines YAML file is empty", but when I launch Logstash from a terminal using the same command (sudo /path/to/logstash/bin/file '-f' /path/to/conf/file) it works perfectly fine.
Is there any other way to monitor the sincedb file while running Logstash? I tried using subprocess.call and Python threads, but neither worked.
So, to understand the problem: you need to process a file using Logstash and Python.
Have you tried running it without the sincedb loop?
Does the Logstash process even start?
What version of Python are you using?
When I use Popen I don't get the error, but it doesn't seem to load the data either. If I use the same conf file and run Logstash in the terminal, the data is correctly loaded, so I know the problem is not the conf file.
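To see what Logstash is doing when started from Popen, it may help to pass the command as a list of arguments and send its output to a log file. The sketch below is only an illustration under stated assumptions: run_until, is_done and log_path are hypothetical names, is_done is any zero-argument callable that reports when the file has been fully read, and the script is assumed to have permission to signal the process it starts (which may not hold when the command is wrapped in sudo).

```python
import subprocess
import time


def run_until(command, is_done, log_path, poll_seconds=1.0, timeout_seconds=600.0):
    """Start `command`, wait until is_done() is true, then stop the process.

    Returns True if is_done() became true, False if the process exited on
    its own or the timeout passed first. Output is written to `log_path`.
    """
    finished = False
    with open(log_path, "wb") as log:
        # A list without shell=True passes each argument to the program as-is.
        process = subprocess.Popen(command, stdout=log, stderr=subprocess.STDOUT)
        try:
            deadline = time.time() + timeout_seconds
            while time.time() < deadline:
                if process.poll() is not None:
                    break  # process ended by itself; the log should say why
                if is_done():
                    finished = True
                    break
                time.sleep(poll_seconds)
        finally:
            if process.poll() is None:
                process.terminate()  # sends SIGTERM on POSIX systems
                process.wait()
    return finished
```

With the paths from the question, the command would be ["/path/to/logstash/bin/file", "-f", "/path/to/conf/file"]. If the call returns False, the log file should show whether Logstash failed at startup or simply never reported progress.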
I downloaded just the tar.gz, do you think it may work if I download the deb version? I read that's the version I should use in Ubuntu 16.04.3, but considering the data was being loaded sometimes I didn't think of that as a possible solution.