How to control the filebeat read file sequence


(michaelshi) #1

The WAS log is written in order, but the order changes after Filebeat ships it, as shown below:

Time Message
March 10th 2016, 11:43:41.721 [16-2-3 0:01:04:356 CST] 00000026 SystemErr R log4j:WARN Please initia...
March 10th 2016, 11:43:41.721 [16-2-4 17:33:58:032 CST] 00000038 SystemErr R log4j:WARN No appende...
March 10th 2016, 11:43:41.721 [16-2-1 3:23:18:644 CST] 00000025 SystemErr R at com.ibm.ws.webcontai

Is this caused by Filebeat, or by how ES displays the results?

Clicking the Message column to sort did not order the entries by time either.

Thanks


(Magnus Bäck) #2

ES only stores millisecond resolution and since all three messages have the same timestamp it can't sort them any better.


(Steffen Siering) #3

Can you check the 'offset' of indexed documents in elasticsearch?
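A minimal sketch of why the offset is worth checking: if documents share the same @timestamp, the byte offset within the source file can act as a tie-breaker that restores file order. The documents, the file path, and the offset values below are made up for illustration; only the idea of sorting on (timestamp, source file, offset) is the point.

```python
# Hypothetical documents, shaped like events shipped by Filebeat.
# The path and the offset values are invented for this example.
docs = [
    {"@timestamp": "2016-03-10T11:43:41.721", "source": "/logs/SystemErr.log",
     "offset": 4096, "message": "third line in file"},
    {"@timestamp": "2016-03-10T11:43:41.721", "source": "/logs/SystemErr.log",
     "offset": 0, "message": "first line in file"},
    {"@timestamp": "2016-03-10T11:43:41.721", "source": "/logs/SystemErr.log",
     "offset": 2048, "message": "second line in file"},
]


def file_order_key(doc):
    """Sort by timestamp first, then by file and byte offset as tie-breakers."""
    return (doc["@timestamp"], doc["source"], doc["offset"])


ordered = sorted(docs, key=file_order_key)
for doc in ordered:
    print(doc["offset"], doc["message"])
```

With identical timestamps, the result depends only on the offset, so the lines come back in the order they were written to the file.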


(Aleksey Timchenko) #4

Hi All,

We have faced the same issue. It looks like it happens because Filebeat sends a batch of log lines to Logstash as one bulk message. As a result, all of those messages get the same @timestamp, and when you query them in Kibana they are returned in random order, because they are sorted by @timestamp, which is identical for the whole set.

Are there any options to avoid this behaviour? I would be happy if Beats could send messages one by one, as they appear in the file, without merging them into bulk inserts.

We've tried changing scan_frequency and bulk_max_size, but it doesn't really help.
Any thoughts on this?

Regards.


(Magnus Bäck) #5

The @timestamp value shouldn't be the one added by Logstash. It should be picked up from the log message itself.


(michaelshi) #6

Thanks for your reply.
The background is that I want to monitor Tomcat logs. A log can contain exceptions, and an exception comes with a stack trace. From the user's point of view, the message and its stack trace belong together as one message; if the lines are split up and displayed in random order, they are very inconvenient to read.
I solved it as follows:

  1. I adjusted the log output format so that every log entry starts with a fixed marker, such as 2016.
  2. In the Filebeat configuration file, I use a regular expression to merge the lines that belong together.

Thank you.


(Steffen Siering) #7

Filebeat assigns the timestamp when it reads the file. Normally one uses grok in Logstash to parse the log messages and extract the timestamp from the original log message. This gives you the correct order.
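To illustrate the idea outside of Logstash, here is a small Python sketch that extracts the timestamp from the bracketed prefix of the sample lines in the first post and sorts by it. It assumes the prefix layout is `[yy-m-d H:MM:SS:mmm TZ]` and ignores the time zone abbreviation (CST is ambiguous), so the result is a naive datetime; it is not how grok itself works, only the same principle.

```python
import re
from datetime import datetime

# Assumed prefix layout: [yy-m-d H:MM:SS:mmm TZ], e.g. [16-2-3 0:01:04:356 CST]
PREFIX = re.compile(
    r"^\[(\d{1,2}-\d{1,2}-\d{1,2} \d{1,2}:\d{2}:\d{2}:\d{3}) \w+\]"
)


def parse_log_time(line):
    """Return the timestamp written by the application, or None if absent.

    The time zone abbreviation is dropped, so the datetime is naive.
    """
    match = PREFIX.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%y-%m-%d %H:%M:%S:%f")


lines = [
    "[16-2-3 0:01:04:356 CST] 00000026 SystemErr R log4j:WARN Please initia...",
    "[16-2-4 17:33:58:032 CST] 00000038 SystemErr R log4j:WARN No appende...",
    "[16-2-1 3:23:18:644 CST] 00000025 SystemErr R at com.ibm.ws.webcontai",
]

# Sorting on the extracted time restores the order in which the lines were logged.
ordered = sorted(lines, key=parse_log_time)
for line in ordered:
    print(parse_log_time(line).isoformat(), line[:40])
```

Once the event time comes from the message rather than from the moment it was read, lines shipped in the same batch no longer share one timestamp.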

When dealing with exceptions, the preferred solution is to use the multiline support in Filebeat to combine the message and its full stack trace into one event, so that they are read and indexed as one entity. Proper use of multiline prevents a message from being split up and displayed in random order.
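The merging idea can be sketched in a few lines of Python. This is not Filebeat's implementation, only an illustration of the principle: a line that matches a start-of-entry pattern opens a new event, and every other line is appended to the previous event. The pattern (a leading `yyyy-mm-dd` date) and the sample log lines are assumptions made for the example.

```python
import re

# Assumption: every new log entry starts with a date such as 2016-04-14.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2}")


def merge_multiline(lines, start_pattern=ENTRY_START):
    """Group continuation lines (e.g. stack traces) with the entry before them."""
    events = []
    for line in lines:
        if start_pattern.match(line) or not events:
            # A new entry begins (or there is nothing yet to append to).
            events.append(line)
        else:
            # Continuation line: attach it to the current event.
            events[-1] += "\n" + line
    return events


# Invented sample input: one error with a stack trace, then a normal line.
sample = [
    "2016-04-14 20:46:01 ERROR Request failed",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Handler.run(Handler.java:42)",
    "2016-04-14 20:46:02 INFO Next request",
]

events = merge_multiline(sample)
for event in events:
    print(repr(event))
```

Here the error line and its two stack trace lines end up in a single event, which is then shipped and indexed as one document.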


(Aleksey Timchenko) #8

Thank you for the response, it was very useful.

Previously we used logstash-forwarder/nlog/nxlog and sorted by the @timestamp generated by Logstash, which satisfied our needs.

Now, to follow the Beats approach, we changed our Elasticsearch index templates: the field that holds the original log entry time (written by the source application) was changed from a string to the "date" type, and in Kibana we switched to this new field as the "Time-field name" in the index patterns.

