I have a requirement to pick up log lines starting at a given start word, say 'ABC', and combine all following lines until 'XYZ' appears in the log file. If 'XYZ' is found, all the combined lines should be shipped to Logstash as one request. If 'XYZ' is not found, Filebeat should wait and keep appending incoming log lines to the earlier ones until 'XYZ' is found or the flush timeout expires. To test this scenario I am using a sample file with the input below.
I am expecting two documents to be inserted into Elasticsearch.
The third document should not be inserted, because the closing pattern 'XYZ' has not yet been added to the file after the value 10 in the above sample.
However, when I run Filebeat, I see three documents inserted: the third document, which does not have 'XYZ' at the end, still gets flushed and pushed to Elasticsearch.
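To make the intended grouping concrete, here is a minimal Python sketch of the behaviour I am after. It is not Filebeat code: the function name group_events and the sample lines are made up for illustration, and it assumes the start word appears at the beginning of a line and the end word at the end of a line.

```python
from typing import Iterable, Iterator, List


def group_events(lines: Iterable[str], start: str = "ABC", end: str = "XYZ") -> Iterator[str]:
    """Yield one combined event per start..end block (hypothetical helper)."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        # Outside a block, skip everything until a line begins with the start word.
        if not buffer and not line.startswith(start):
            continue
        buffer.append(line)
        # Emit the block only once a line ends with the closing word.
        if line.endswith(end):
            yield " ".join(buffer)
            buffer = []
    # Lines still in the buffer have no closing word yet, so they are held back.


if __name__ == "__main__":
    # Illustrative input only; the last block is still open and must not be emitted.
    sample = ["ABC 1", "2", "XYZ", "ABC 3", "4 XYZ", "ABC 9", "10"]
    for event in group_events(sample):
        print(event)
```

With this illustrative input only the two closed blocks are emitted, while ABC 9 10 stays buffered until a line ending in XYZ arrives. That is the behaviour I would like from Filebeat, with the flush timeout as the only other way out.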
- type: filestream
I see that similar questions were asked earlier, but they remain unanswered to date. Below are the links.
I tried adding multiline.timeout: 50s, which makes Filebeat wait before pushing the third document (message ABC 9 10) until the timeout expires. But this creates another issue: if I add another entry to the log before the timeout, say 11 12 XYZ, Filebeat inserts 11 and 12 as a separate document, and the message ABC 9 10 11 12 XYZ as another document after the timeout.
Any suggestions on how to reliably read all log lines that fall between two specific patterns, when there may be a long wait before the matching closing pattern is appended to the file?