Messages from different files are combined as one document

Hi,

I'm trying to ingest log files into Elasticsearch through Filebeat. The flow is: log file -> Filebeat -> Logstash (indexer) -> Elasticsearch.

Filebeat is set up to monitor several files in the same folder; the path is configured as e:/log/application1/*.log

There is a multiline filter configured in the Logstash indexer, as below:
multiline {
  pattern => "^%{MONTHDAY}-%{MONTH}-%{YEAR} %{TIME}"
  what => "previous"
  negate => true
}

A sample log file looks something like this:
15-Feb-2016 01:42:27:907 message 1
15-Feb-2016 01:42:28:048 message 2
15-Feb-2016 01:42:28:204 message 3
continued message 3 (does not start with a timestamp)
continued message 3 (does not start with a timestamp)
.... (can range from a few lines to a few thousand lines for the same message)
15-Feb-2016 01:42:36:518 message 4

What happens is that messages from different files (with different timestamps) are put together as one document (as viewed in Kibana), something like below:
15-Feb-2016 01:02:28:048 message x (from file 1)
partial message line y (from file 2, line does not start with a timestamp)
continued partial message line y (from file 2, line does not start with a timestamp)

I don't observe this problem when I ingest through Logstash instead (log file -> Logstash shipper -> Logstash indexer -> Elasticsearch) with the same indexer configuration, so I categorized this problem under Filebeat :stuck_out_tongue:

Does anyone have any idea what causes this problem? Thanks a lot!

I would recommend upgrading to the latest version of Filebeat, which contains support for multiline processing. This will allow you to perform the multiline processing directly at the source, which will simplify your Logstash configuration and remove the need for the Logstash multiline filter, which is being deprecated.
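
For illustration, here is a rough sketch of what the equivalent multiline settings might look like on the Filebeat side. It assumes the Filebeat 1.x `filebeat.prospectors` layout; later versions put the same `multiline` options under `filebeat.inputs`. Filebeat takes a plain regular expression rather than grok patterns, so the pattern below is only an approximation of `%{MONTHDAY}-%{MONTH}-%{YEAR} %{TIME}`, and the `max_lines` value is an assumed figure picked to cover the "few thousand lines" case (the default limit is lower), not something taken from the original setup.

```yaml
filebeat:
  prospectors:
    -
      paths:
        - e:/log/application1/*.log
      input_type: log
      multiline:
        # Lines that start with a timestamp such as "15-Feb-2016 01:42:27"
        # begin a new event (regex approximation of the grok pattern).
        pattern: '^[0-9]{2}-[A-Za-z]{3}-[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}'
        # negate + "after": lines NOT matching the pattern are appended to
        # the preceding matching line (like what => "previous" in Logstash).
        negate: true
        match: after
        # Assumed value; raise it if single messages can span more lines.
        max_lines: 5000
```

Because each file is read by its own harvester, the lines are joined per file before they are shipped, so lines from different files should no longer end up in the same event. The multiline filter in the Logstash indexer can then be removed.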

Thanks a lot! It worked once I moved the multiline configuration to Filebeat :slightly_smiling: