High-Speed Log Events Not Fully Captured by Logstash File Input

  file {
    path => "/mnt/efs-logstash/application-logs/log_workers/production_*.log"
    start_position => "beginning"
    sincedb_path => "/var/lib/logstash/sincedb_logger"
    type => "json"
    stat_interval => "200 ms"
    ecs_compatibility => disabled
    tags => ["worker_logger_input"]
  }

I'm using the file input plugin to read log files generated by the application, but I've noticed that some events are never read by Logstash v8.19.

To rule out the HTTP output, I've implemented a mechanism that logs the original event for every processed record. According to the pipeline metrics there is no data loss downstream, and the average throughput is 1,500 events per second. The instance has 4 CPUs, and the worker count is set to 4.
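
For context, the pipeline sizing in logstash.yml looks roughly like this (the batch values below are the Logstash defaults, shown only for completeness; the worker count is the only thing I set explicitly):

  # logstash.yml
  pipeline.workers: 4       # one worker per CPU
  pipeline.batch.size: 125  # events per worker batch (default)
  pipeline.batch.delay: 50  # ms to wait before flushing an undersized batch (default)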

I've tried several configurations to increase the frequency of file change detection, but saw no improvement. One relevant characteristic is that events are written in very quick succession: for example, the application writes a JSON line indicating the START of a process, and only a few milliseconds later the same OS PID writes the line saying the process has completed.
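
The change-detection settings I've been experimenting with are roughly the ones below (the values come from one of the test runs, not a final configuration):

  file {
    path => "/mnt/efs-logstash/application-logs/log_workers/production_*.log"
    # how often watched files are stat'ed for new content
    stat_interval => "200 ms"
    # how often the path glob is re-expanded to discover new files,
    # expressed as a multiple of stat_interval
    discover_interval => 5
    # how often read positions are flushed to the sincedb file
    sincedb_write_interval => "5 seconds"
  }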

Could this be a race condition issue, or is there some configuration I might have overlooked?

I'm using AWS EFS, where approximately 20 EC2 instances write the logs; there is one file per instance, separated by day, month, and year.

Hello and welcome,

This is NFS, right?

That may be the issue: the file input can have problems reading from NFS, since reading from remote network volumes is not well tested.

I don't think there is much you can do here. You would probably need to change the way you collect the logs, for example by having Filebeat collect them directly from each EC2 instance's local disk, which also avoids reading from NFS, as that is not recommended for Filebeat either.
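
A minimal sketch of what that could look like on each EC2 instance, assuming the application can also write its logs to a local path such as /var/log/app and that Logstash listens for Beats connections on port 5044 (both are placeholders for your environment):

  # filebeat.yml on each EC2 instance
  filebeat.inputs:
    - type: filestream
      id: worker-logs
      paths:
        - /var/log/app/production_*.log
      # parse each line as JSON instead of plain text
      parsers:
        - ndjson:
            target: ""
            add_error_key: true

  output.logstash:
    hosts: ["logstash.internal:5044"]

On the Logstash side you would then replace the file input with a beats input listening on the same port.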