Timestamps precision in beats


(George Sovetov) #1

What does Filebeat do about timestamps? Especially when "backpressure" feature is working.

More generally, how do I get synchronized timestamps?

On machines, time may misalign and, moreover, it may change while Filebeat is working. Therefore, it becomes impossible to see "surrounding" logs. On other hand, there is definitely possible to add some time synchronization logic.


(Jaime Soriano) #2

Hi @George_Sovetov,

Filebeat uses by default the time the log line was read, so if the read is delayed for any reason this time won't be accurate. But in addition to that, filebeat modules use to include processors that replace this timestamp with the time stored in the log itself, that should be as accurate as the system monitored. In general it should be enough if all the monitored systems have a synchronized time, what can be achieved using NTP.
Times are sent in UTC by default, what should prevent problems with daylight saving time changes and so on.

What is your use case? do you need more accurate time synchronization?


(George Sovetov) #3

Hi @jsoriano
I use VMs to test software and intentionally change system time on VMs as part of the tests. That's why I cannot rely on system time or use NTP. (Unusual, I know.)
Logstash/Elasticsearch servers have correct time.
Communication between VM and Logstash/Elasticsearch servers is quite quick.
Does that mean that the best I can do is to use timestamp added by Logstash?


(Jaime Soriano) #4

Yes, then I guess that then the best you can do is to add a timestamp in logstash, but take into account that this will be the time when the log line was processed by logstash, that will surely contain some delay.


(George Sovetov) #5

BTW, when file is read, is it done using poll/select with timeouts or file just checked on regular basis (every second)? In second case I'll get extra delay up to checking period...


(Jaime Soriano) #6

Once haversting of a file has started, it is read by default as soon as there is available content, more in a select fashion. close_inactive can be used to close inactive files so there are not so many files open at the same time.
For new files there can be some delay, filebeat checks by default every 10 seconds for new files, this can be customized with scan_frequency setting.
Also, once the logs are read they are sent to logstash in batches, so there can be an additional delay till a batch is ready to be sent, there are some settings to control that, like bulk_max_size.


(George Sovetov) #7

It's very clear now. Thank you a lot, @jsoriano!


(George Sovetov) #8

Hi @jsoriano,

I'm currently implementing what we've discussed here above and got a problem. Logstash takes @timestamp field from Filebeat, and I couldn't find the way to make reset @timestamp field in event received from Filebeat. Is it possible to use time from Logstash machine with beats?


(Jaime Soriano) #9

Hi @George_Sovetov,

Take a look to this thread. As the events will probably have already a @timestamp logstash won't set it to its current time, but you can use the ruby scripting filter to do whatever you want.


(George Sovetov) #10

Will it have any performance implications?


(George Sovetov) #11

What I ended up with now is:

  if [@metadata][beat] == "filebeat" {
    ruby {
      code => "event.set('logstash_processed_at', Time.now());"
    }
  }

Found it here: Add current timestamp to a logstash_processed_at field