Logstash lost data when using file input plugin

Hi all,

I use the Logstash file input plugin, and I am seeing data loss.

Every hour, a job downloads log files to the path ("/path/to/.../.log."), and I use the Logstash file input plugin to tail them.

When Logstash starts, it reads the files normally, but a few hours later it starts missing data, as in this graph:


I don't know what causes this. Everything works normally on my old Elastic Stack on a Windows server (1 ES node), as shown below,


but on the new Elastic Stack on a CentOS server (3 ES nodes), it becomes abnormal.

The differences between the old and new Elastic Stack environments:

  1. OS: the old one is Windows, the new one is CentOS.

  2. Elastic Stack version: old: 6.7.2; new: 7.4.2.

  3. Number of Elasticsearch nodes: old: single node; new: 3 nodes.

  4. In the old environment, log files are downloaded to the path on the Windows server once a day (30 1 * * *); in the new environment on the CentOS server, every hour (30 * * * *).

  5. The new environment uses index lifecycle management (ILM) for housekeeping of the log data.
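For reference, the two download schedules above written as crontab entries (the script path here is just a placeholder, not the actual job):

```
# old environment: once a day at 01:30
30 1 * * *  /path/to/download_logs.sh
# new environment: every hour at minute 30
30 * * * *  /path/to/download_logs.sh
```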

Here is my Logstash conf:

input {
  file {
    type => "log"
    mode => "tail"
    path => "/path/to/.../*.log.*"
    codec => multiline {
      # lines that do not start with a "YYYY-MM" timestamp belong to the previous event
      pattern => "^\d{4}-\d{2}"
      negate => true
      what => "previous"
      auto_flush_interval => 1
    }
    start_position => "beginning"
    ignore_older => "12 h"     # skip files not modified within the last 12 hours
    sincedb_clean_after => 1   # forget tracked files after 1 day of inactivity
  }
}

filter {
  if [type] == 'log'
  {
    if [message] == "" { drop{} }
    mutate {
      ...
    }
    csv {
      ...
    }
    grok{
     ...
    }
    kv {
      ...
    }
    date {
      ...
    }
    ruby {
     ...
    }
    jdbc_static {
    ...
    }
    if "_dateparsefailure" in [tags] { drop{ } }
  }
}

output {
  if [type] == 'log'
  {
    elasticsearch {
      hosts => ["es1", "es2", "es3"]
      index => "log-%{+YYYY.MM.dd}"
      ilm_rollover_alias => "log"
      ilm_policy => "log-house-keeping"
    }
  }
}

The sincedb file generated by the Logstash file input plugin:
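(As I understand it, in this version of the plugin each sincedb line has the form below; the example values are illustrative, not from my actual file:)

```
# inode  major_dev  minor_dev  byte_offset  last_active_timestamp  path
1234567 0 51714 719440 1575441694.8462 /path/to/example.log
```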