Hi community, this is my first post on this forum, and I would like to ask for help with Logstash.
So here's the scenario:
I have a Python script that pulls logs from Cloudflare ELS and saves them to a local folder as gzip files. Logstash is then configured to watch that folder and pick up any new log files.
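For context, the writing side is roughly like this minimal sketch (simplified; `save_chunk` and the payload are illustrative, not my exact code):

```python
import gzip

def save_chunk(log_bytes: bytes, path: str) -> None:
    # Save one pull from Cloudflare ELS as a gzip file
    # in the folder that Logstash watches.
    with gzip.open(path, "wb") as f:
        f.write(log_bytes)

# Example:
# save_chunk(ndjson_payload, "/var/log/cf_logs/cf_logs_<start>~<end>.json.gz")
```

One thing I am not sure about: if Logstash scans the folder while a file is still being written, could it see a truncated gzip? Writing to a temporary name and renaming afterwards would avoid that, but I don't know if it is relevant here.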
Refer below for the logstash.conf:
input {
  file {
    path => "/var/log/cf_logs/*.gz"
    mode => "read"
    file_completed_action => "log_and_delete"
    file_completed_log_path => "/var/log/logstash/logstash.log"
  }
}

filter {
  json {
    source => "message"
  }
  mutate {
    remove_field => [ "message", "path", "host", "@version" ]
  }
}

output {
  elasticsearch {
    user => "${LOGSTASH_PUSH_USERNAME}"
    password => "${LOGSTASH_PUSH_PASSWORD}"
    index => "cloudflare"
    pipeline => "cloudflare-pipeline-daily"
    cacert => "${CERTS_DIR}/ca/ca.crt"
    hosts => "https://${ES03_NAME}:9200"
    http_compression => true
    ilm_enabled => false
  }
}
And the log filenames look like this:
cf_logs_2021-03-26T06_01_00Z~2021-03-26T06_01_10Z.json.gz
cf_logs_2021-03-26T06_01_10Z~2021-03-26T06_01_20Z.json.gz
cf_logs_2021-03-26T06_01_20Z~2021-03-26T06_01_30Z.json.gz
cf_logs_2021-03-26T06_01_30Z~2021-03-26T06_01_40Z.json.gz
cf_logs_2021-03-26T06_01_40Z~2021-03-26T06_01_50Z.json.gz
cf_logs_2021-03-26T06_01_50Z~2021-03-26T06_02_00Z.json.gz
While Logstash is running, I notice that it does not actually pick up and process every file; some files just remain in the folder. Each file should be deleted automatically once it has been processed, but that is not happening. Out of 10 files, maybe 5 are not processed by Logstash. Moreover, the Logstash event log does not show any error related to this issue.
I do not encounter this issue when the logs are uncompressed (plain JSON), so it is strange that Logstash cannot process the gzipped logs properly.
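To rule out truncated or corrupt archives as the cause, the leftover files could be checked with something like this (a quick sketch; `find_corrupt` is just a helper name I made up, it simply tries to fully decompress each file):

```python
import glob
import gzip

def find_corrupt(pattern: str = "/var/log/cf_logs/*.gz"):
    # Return (path, error) pairs for files that fail full decompression.
    # A truncated archive raises EOFError; a non-gzip file raises OSError.
    bad = []
    for path in glob.glob(pattern):
        try:
            with gzip.open(path, "rb") as f:
                while f.read(1 << 20):
                    pass
        except (OSError, EOFError) as e:
            bad.append((path, str(e)))
    return bad
```

If this reports nothing, the skipped files are valid gzip and the problem is presumably on the Logstash side.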
Are there any tricks to prevent this from happening? Or is it a bug?
Thanks in advance.