Hello
I have a daily process that downloads some compressed log files and then extracts them into a directory that Logstash is watching.
weblogs pipeline:
input {
  file {
    id => "weblogs_input_file"
    path => "/data/logs/web/logs/*.log"
    codec => plain {
      charset => "ISO-8859-1"
    }
  }
}
filter { ... }
output {
  if "_grokparsefailure" in [tags] {
    elasticsearch {
      id => "weblogs_output_skipped_lines"
      index => "skipped-logs-2"
    }
  } else {
    elasticsearch {
      id => "weblogs_output_elastic"
      # default index
    }
  }
}
Today, Logstash decided to only ingest the latter half of one file (the other 11 files processed fine).
It doesn't appear to be a grok failure, as the missing lines aren't indexed in the skipped-logs-2 index.
The only other cause I can think of is that the new log file somehow matched an entry already in the file input's sincedb, although I'm not sure how to prove or disprove that.
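One way I could imagine testing the sincedb theory (a sketch only; the sincedb path below is the default for a package install and may differ, and `today.log` is a hypothetical filename):

```shell
# The file input keys its sincedb on inode + device, with the byte offset
# already read stored as a later field. If a new file's inode matches an
# old entry, Logstash resumes from that stale offset instead of byte 0,
# which would look exactly like "only the latter half was ingested".

LOG=/data/logs/web/logs/today.log        # hypothetical new file
INODE=$(stat -c %i "$LOG")               # inode = first field of each sincedb line

# Default sincedb location under path.data for a package install (adjust to yours):
grep "^$INODE " /var/lib/logstash/plugins/inputs/file/.sincedb_* \
  && echo "inode $INODE is already tracked in the sincedb"
```

If the grep matches and the recorded offset is large, the new file inherited an old file's read position.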
The file /var/log/logstash/logstash-plain.log only contains cidr warnings, e.g. something like:
[2018-09-05T04:01:44,805][WARN ][logstash.filters.cidr ] Invalid IP address, skipping {:address=>"%{clientip}".....
There are only 61 of those warnings, but I'm missing thousands of log lines.
I also made a copy of the file (e.g. cp file.log newfile.log), and Logstash happily reingested the entire copy, so I don't think anything in this specific file caused the issue.
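Thinking about it, the cp result might actually fit the sincedb theory rather than rule it out: cp writes a brand-new file with a fresh inode, so the copy could never collide with an old sincedb entry even if the original did. A quick sanity check of that assumption (hypothetical paths):

```shell
# cp creates a new inode; if the file input tracks files by inode,
# the copy always looks like a never-seen file to Logstash.
cp /data/logs/web/logs/file.log /data/logs/web/logs/newfile.log
stat -c %i /data/logs/web/logs/file.log /data/logs/web/logs/newfile.log
# The two inodes printed differ, so the copy can't match an existing sincedb entry.
```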
Does anyone have any suggestions on what might have caused this, and how I can prevent it in the future?
Thanks