Logstash does not pick up all gzip log files

Hi community, this is my first post in this forum and I would like to seek help regarding Logstash.

So here's the scenario:

I have a Python script that pulls logs from Cloudflare ELS and saves them to a local folder in gzip format. I then configured Logstash to watch that folder and pick up any new log files.
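
For illustration, the save step could look roughly like the sketch below. This is not the actual script: the `save_batch` name, the one-JSON-object-per-line layout, and the write-then-rename step are all assumptions. The rename is there so that the `*.gz` glob only ever matches fully written files.

```python
import gzip
import json
import os


def save_batch(records, dest_dir, filename):
    """Write records as gzip-compressed, newline-delimited JSON.

    Hypothetical helper: name, arguments and layout are assumptions.
    """
    final_path = os.path.join(dest_dir, filename)
    # The ".tmp" suffix keeps the partial file out of a "*.gz" glob.
    tmp_path = final_path + ".tmp"
    try:
        with gzip.open(tmp_path, "wt", encoding="utf-8") as fh:
            for record in records:
                # One JSON object per line, matching a json filter on "message".
                fh.write(json.dumps(record) + "\n")
        # Rename within the same directory, so the final name appears in one step.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Do not leave a half-written temporary file behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path
```

Whether partially written files play any part in the problem described here is not established; the sketch only shows one way to rule that out.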

Here is the Logstash.conf:

input {
    file {
        path => "/var/log/cf_logs/*.gz"
        mode => "read"
        file_completed_action => "log_and_delete"
        file_completed_log_path => "/var/log/logstash/logstash.log"
    }
}

filter {
    json {
        source => "message"
    }
    mutate {
        remove_field => [ "message", "path", "host", "@version" ]
    }
}

output {
    elasticsearch {
        user => "${LOGSTASH_PUSH_USERNAME}"
        password => "${LOGSTASH_PUSH_PASSWORD}"
        index => "cloudflare"
        pipeline => "cloudflare-pipeline-daily"
        cacert => "${CERTS_DIR}/ca/ca.crt"
        hosts => "https://${ES03_NAME}:9200"
        http_compression => true
        ilm_enabled => false
    }
}

The log filenames look like this:

cf_logs_2021-03-26T06_01_00Z~2021-03-26T06_01_10Z.json.gz
cf_logs_2021-03-26T06_01_10Z~2021-03-26T06_01_20Z.json.gz
cf_logs_2021-03-26T06_01_20Z~2021-03-26T06_01_30Z.json.gz
cf_logs_2021-03-26T06_01_30Z~2021-03-26T06_01_40Z.json.gz
cf_logs_2021-03-26T06_01_40Z~2021-03-26T06_01_50Z.json.gz
cf_logs_2021-03-26T06_01_50Z~2021-03-26T06_02_00Z.json.gz

While Logstash is running, I notice that it does not pick up and process every file. Processed files should be deleted automatically, but some files remain in the folder. Out of 10 files, as many as 5 may go unprocessed, and the Logstash log shows no error related to this.

I do not encounter this issue when the logs are uncompressed (plain JSON), so it is odd that Logstash cannot process the gzip logs properly.

Is there any way to prevent this from happening? Or is it a bug?

Thanks in advance.

It might be a problem with inode re-use. There are several open issues around that.
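
One way to check that theory is to record the inode of every file that shows up in the folder and see whether a leftover file carries an inode that an earlier, already deleted file had. As far as I understand it, the file input tracks read progress in its sincedb by inode and device numbers, so a new file that receives a recycled inode can look as if it had already been read. A minimal sketch, assuming the folder from the config above; the helper names are made up for illustration:

```python
import glob
import os


def inode_map(pattern):
    """Return {path: (device, inode)} for every file matching pattern."""
    result = {}
    for path in sorted(glob.glob(pattern)):
        st = os.stat(path)
        result[path] = (st.st_dev, st.st_ino)
    return result


def reused(seen, current):
    """Map each path in current to the earlier path that had the same (device, inode)."""
    by_id = {ident: path for path, ident in seen.items()}
    return {
        path: by_id[ident]
        for path, ident in current.items()
        if ident in by_id and by_id[ident] != path
    }


if __name__ == "__main__":
    # Path taken from the config in the question; adjust as needed.
    for path, ident in inode_map("/var/log/cf_logs/*.gz").items():
        print(ident, path)
```

Take one snapshot with `inode_map` before Logstash deletes a batch and another after new files arrive, then pass both to `reused`. If the leftover files appear in the result, inode re-use is a likely cause, and the `sincedb_clean_after` setting of the file input is worth reading up on. I have not verified that it resolves this particular case.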
