File input plugin doesn't process multiple input files when any of them is being written to intensively


(Vkakhnych) #1

Tested on Logstash 2.3, CentOS 6.
A simple Logstash config:


input {
  file {
    path => [ "/tmp/*-running.log" ]
  }
}

output {
  stdout { codec => json_lines }
}

And a simple script to populate the logs:


#!/bin/sh

# Append a timestamped line to both logs as fast as possible.
while true
do
  DATE=$(/bin/date +'%Y-%m-%d@%T')
  echo "$DATE batch1" >> /tmp/Test1-running.log
  echo "$DATE batch2" >> /tmp/Test2-running.log
#  sleep 1
done

I expected to get lines from both logs interleaved in the output, but I only get content from one of them:
...
{"message":"2017-10-13@09:24:40 batch2","@version":"1","@timestamp":"2017-10-13T09:26:13.216Z","path":"/tmp/Test2-running.log","host":"localhost.localdomain"}
{"message":"2017-10-13@09:24:40 batch2","@version":"1","@timestamp":"2017-10-13T09:26:13.219Z","path":"/tmp/Test2-running.log","host":"localhost.localdomain"}
{"message":"2017-10-13@09:24:40 batch2","@version":"1","@timestamp":"2017-10-13T09:26:13.219Z","path":"/tmp/Test2-running.log","host":"localhost.localdomain"}
{"message":"2017-10-13@09:24:40 batch2","@version":"1","@timestamp":"2017-10-13T09:26:13.221Z","path":"/tmp/Test2-running.log","host":"localhost.localdomain"}
{"message":"2017-10-13@09:24:40 batch2","@version":"1","@timestamp":"2017-10-13T09:26:13.221Z","path":"/tmp/Test2-running.log","host":"localhost.localdomain"}
{"message":"2017-10-13@09:24:40 batch2","@version":"1","@timestamp":"2017-10-13T09:26:13.225Z","path":"/tmp/Test2-running.log","host":"localhost.localdomain"}
{"message":"2017-10-13@09:24:40 batch2","@version":"1","@timestamp":"2017-10-13T09:26:13.225Z","path":"/tmp/Test2-running.log","host":"localhost.localdomain"}
{"message":"2017-10-13@09:24:40 batch2","@version":"1","@timestamp":"2017-10-13T09:26:13.227Z","path":"/tmp/Test2-running.log","host":"localhost.localdomain"}
{"message":"2017-10-13@09:24:40 batch2","@version":"1","@timestamp":"2017-10-13T09:26:13.227Z","path":"/tmp/Test2-running.log","host":"localhost.localdomain"}
{"message":"2017-10-13@09:24:40 batch2","@version":"1","@timestamp":"2017-10-13T09:26:13.229Z","path":"/tmp/Test2-running.log","host":"localhost.localdomain"}
{"message":"2017-10-13@09:24:40 batch2","@version":"1","@timestamp":"2017-10-13T09:26:13.229Z","path":"/tmp/Test2-running.log","host":"localhost.localdomain"}
{"message":"2017-10-13@09:24:40 batch2","@version":"1","@timestamp":"2017-10-13T09:26:13.230Z","path":"/tmp/Test2-running.log","host":"localhost.localdomain"}
{"message":"2017-10-13@09:24:40 batch2","@version":"1","@timestamp":"2017-10-13T09:26:13.230Z","path":"/tmp/Test2-running.log","host":"localhost.localdomain"}
...
Meanwhile, tail -f /tmp/*-running.log shows content from both logs at the same time without any problem.

But if I uncomment sleep 1 in the script above (so that logging is less intensive), I get the expected results from the file plugin. Unfortunately, in real life we have many more logs with a very high logging rate.
Is there any way to get the expected results from the file plugin?
We can't use Filebeat because we use AWS SQS for output.

Thanks a lot!


(Magnus Bäck) #2

Yeah, I think each file input reads the first file until it hits EOF, then it continues with the next and the next and the next, finally circling back to the first one. Would it be possible for you to generate multiple file input blocks instead of using an all-encompassing wildcard? That way you'll get multiple threads processing the input files.
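
A minimal sketch of that suggestion, assuming the per-file blocks are generated by a small shell helper and written to a config file that Logstash then loads. The function name gen_inputs is hypothetical, and the paths in the usage note come from the test script above; this is an illustration, not a tested fix for the behaviour described.

```shell
#!/bin/sh
# Hypothetical helper: print a Logstash "input" section containing one
# file { } block per existing log file passed as an argument, instead of
# a single block with a wildcard path.
gen_inputs() {
  echo "input {"
  for f in "$@"
  do
    # Skip arguments that are not regular files (e.g. an unexpanded glob).
    [ -f "$f" ] || continue
    printf '  file {\n    path => [ "%s" ]\n  }\n' "$f"
  done
  echo "}"
}
```

It could be called as gen_inputs /tmp/*-running.log and the output redirected to a config file of your choice. For the two test logs above this prints one file block for /tmp/Test1-running.log and one for /tmp/Test2-running.log. Note that files created after the config is generated would not be picked up until it is regenerated and Logstash reloads it.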

