I have a problem with the S3 input plugin: the pipeline starts and works fine for a while, but after a couple of hours it fails with this error:
Error: Too many open files - Too many open files
May 25 06:06:07 logstash02.test logstash[143470]: Exception: Errno::EMFILE
May 25 06:06:07 logstash02.test logstash[143470]: Stack: org/jruby/RubyIO.java:1234:in `sysopen'
May 25 06:06:07 logstash02.test logstash[143470]: org/jruby/RubyIO.java:3774:in `read'
After that, no new data is received or processed, which effectively breaks the Logstash pipeline.
I could not find any setting in the documentation to limit the number of open files or to control how long files are kept open.
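One way to watch the open descriptor count grow is to inspect the process directly (a sketch; it assumes pgrep -f logstash matches the Logstash java process and that you run it as root or the logstash user):

# count the file descriptors currently open by the Logstash process
ls /proc/$(pgrep -f logstash | head -1)/fd | wc -l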
This is more of an OS issue than a Logstash one; the simplest solution is to increase the number of files the Logstash process is allowed to open.
By default, Logstash ships with this in its systemd unit file:
LimitNOFILE=16384
You can increase the number to see if it helps.
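A sketch of raising it with a systemd drop-in instead of editing the shipped unit file directly (the service name logstash and the value 65536 are assumptions; pick a value that fits your workload):

sudo systemctl edit logstash
# in the editor that opens, add these two lines and save:
#   [Service]
#   LimitNOFILE=65536
sudo systemctl daemon-reload
sudo systemctl restart logstash
# confirm the running process picked up the new limit
grep 'open files' /proc/$(systemctl show -p MainPID --value logstash)/limits

A drop-in has the advantage of surviving package upgrades, which can overwrite the unit file itself.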
Does your S3 bucket have a lot of files? Are the files small or large? The s3 input basically downloads each file, creates a temporary file while processing it, and then removes the temporary file. If the bucket contains many files, this can produce many temporary files and, depending on their size, a large number of open file descriptors.
You may also try reducing the interval setting, from 60 to 30 seconds for example, so the plugin polls the bucket more frequently and downloads fewer files on each run.
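For example, a sketch of an s3 input with a shorter poll interval (the bucket name, region, and prefix are placeholders):

input {
  s3 {
    bucket   => "my-log-bucket"   # placeholder bucket name
    region   => "us-east-1"       # placeholder region
    prefix   => "logs/"           # optional: only process keys under this prefix
    interval => 30                # poll every 30s instead of the default 60s
    # credentials omitted; the plugin can also pick them up from the
    # environment or an IAM instance profile
  }
}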
Here is the updated unit file with the increased limit applied:

[Unit]
Description=logstash
[Service]
Type=simple
User=logstash
Group=logstash
# Load env vars from /etc/default/ and /etc/sysconfig/ if they exist.
# Prefixing the path with '-' makes it try to load, but if the file doesn't
# exist, it continues onward.
EnvironmentFile=-/etc/default/logstash
EnvironmentFile=-/etc/sysconfig/logstash
ExecStart=/usr/share/logstash/bin/logstash "--path.settings" "/etc/logstash"
Restart=always
WorkingDirectory=/
Nice=19
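# Raised from the 16384 default after hitting Errno::EMFILE (Too many open files)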
LimitNOFILE=1048576
#LimitNOFILE=16384
# When stopping, how long to wait before giving up and sending SIGKILL?
# Keep in mind that SIGKILL on a process can cause data loss.
TimeoutStopSec=infinity
[Install]
WantedBy=multi-user.target