I have Logstash set up to parse and ingest files from a local directory on my server.
The directory holds CDN log files, which are fetched every 10 minutes.
Unfortunately the CDN produces a new log file several times a minute (for example, I have over 1000 log files for 1am-8am today).
The logs are never appended to, so they just need to be read once and then forgotten about.
Logstash seems to be struggling to parse these.
If I go into the directory and run "zcat *.log.gz | wc -l" it shows 103,884 lines, yet only 11,620 hits are showing in Kibana for today.
I would expect Kibana to show 103,884 hits, one per line.
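For a per-file breakdown of that line count, a rough Python sketch along these lines could be used. The function name `count_gz_lines` is just illustrative, and it counts lines by iterating, so a final line without a trailing newline is counted here even though `wc -l` would skip it.

```python
import glob
import gzip
import os


def count_gz_lines(directory, pattern="*.log.gz"):
    """Return {file name: line count} for every gzip file matching pattern."""
    counts = {}
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        # gzip.open decompresses on the fly, like zcat does
        with gzip.open(path, "rb") as handle:
            counts[os.path.basename(path)] = sum(1 for _ in handle)
    return counts
```

Calling `count_gz_lines("/data/logs")` and summing the values should give roughly the same total as the zcat command, and the per-file numbers can be compared against what shows up in Kibana for each source file.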
Looking in the file_completed_log, quite a few of the files are missing from it.
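To see exactly which files were skipped, the directory listing can be compared against the completed log. This is only a sketch: it assumes the file_completed_log contains one full file path per line, and `find_unprocessed` is a made-up helper name, not anything Logstash provides.

```python
import glob
import os


def find_unprocessed(log_dir, completed_log, pattern="*.log.gz"):
    """Return sorted paths matching pattern that the completed log does not list."""
    # Assumption: the completed log holds one full path per line
    with open(completed_log, encoding="utf-8") as handle:
        completed = {line.strip() for line in handle if line.strip()}
    on_disk = set(glob.glob(os.path.join(log_dir, pattern)))
    return sorted(on_disk - completed)
```

With the paths from my setup this would be `find_unprocessed("/data/logs", "/data/logstash-db/file_completed_log")`, which returns the files that are on disk but were never logged as completed.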
The relevant options from my file input config are below:
path => "/data/logs/*.log.gz"
sincedb_path => "/data/logstash-db/sincedb"
mode => "read"
file_completed_action => "log"
file_completed_log_path => "/data/logstash-db/file_completed_log"
The log files have names like cds_20190522-154421-57378698007ch4.log.gz (and so on).
Can anyone think of why this could be happening? Is Logstash known for struggling with lots of small files?