Filebeat - difference in beat_timestamp and @timestamp

Hi guys, first, let me explain my setup.

Using Filebeat, Logstash, Kibana and Elasticsearch 6.3.x.

Windows Server (Filebeat with the IIS module, plus Metricbeat) -> Logstash Server -> Elasticsearch <- Kibana

Configs:

Filebeat.yml:

#=========================== Filebeat inputs =============================

filebeat.inputs:

- type: log
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - L:\Logs\L*\*\*.log
    - L:\Logs\L*\*\*\*.log


  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}\s[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}\s\['
  multiline.negate: true
  multiline.match: after
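  # Example of a line the pattern above matches (hypothetical):
  #   2018-07-30 14:22:05,123 [Platform] ...
  # Lines without that timestamp prefix (stack traces etc.) are appended
  # to the preceding event (negate: true, match: after).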

  exclude_files: ['^_Error-(.*)+']

  tags: platform-logs

#============================= Filebeat modules ===============================

filebeat.config.modules:

  path: ${path.config}/modules.d/*.yml

  reload.enabled: true

  reload.period: 60s

#==================== Elasticsearch template setting ==========================

setup.template.settings:
  index.number_of_shards: 1
  index.codec: best_compression
  #_source.enabled: false

#================================ General =====================================

name: APP-01-LON

#================================ Outputs =====================================

# Configure what output to use when sending the data collected by the beat.

#----------------------------- Logstash output --------------------------------
output.logstash:
  hosts: ["log1:5001","log2:5001", "log3:5001"]

Logstash pipeline conf (input, filters, outputs):

input {
  beats {
    port => 5001
  }
}


filter {
 #platform logs .net
 if "platform-logs" in [tags] {

      mutate {
         copy => { "@timestamp" => "beat_timestamp" }
      }
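      # Note: at this point @timestamp is still the time Filebeat read the
      # line; the date filter below overwrites @timestamp with the time
      # parsed from the log line itself, so beat_timestamp - @timestamp
      # shows how far behind Filebeat picked the entry up.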

      grok {
        match => {
          "message" => [
            "%{GREEDYDATA:time} \[%{DATA:if.platform}\] \[%{DATA:if.app}\] \[%{DATA:if.correlationid}\] \[%{DATA:if.threadid}\] \[%{DATA:if.loglevel}\] \[%{DATA:if.logsource}\] - %{GREEDYDATA:if.logmessage}"
          ]
        }
        remove_field => [ "message" ]
      }

      date {
        match => [ "time", "yyyy-MM-dd HH:mm:ss,SSS" ]
        remove_field => [ "time" ]
      }

 }

else if "syslog-logs" in [tags] {

   grok {
          match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
          add_field => [ "received_at", "%{@timestamp}" ]
          add_field => [ "received_from", "%{host}" ]
      }
   mutate {
     add_field => { "read_timestamp" => "%{@timestamp}" }
   }

   date {
         match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
   }
 }
 # nodejs logs (gel)
 else if "nodejs-logs" in [tags] {
  mutate {
     copy => { "@timestamp" => "beat_timestamp" }
     rename => { "correlationID" => "if.correlationid" }
     rename => { "threadID" => "if.threadid" }
     rename => { "platformKey" => "if.platform" }
     rename => { "level" => "if.loglevel" }
     rename => { "message" => "if.logmessage" }
     rename => { "source" => "if.logsource" }
     rename => { "app" => "if.app" }

  }

  date {
    match => [ "timestamp", "ISO8601" ]
    remove_field => [ "timestamp" ]
  }
 }
  #metric logs .net
   else if "metricbeat" in [tags] {

   }
}

output {

if "nodejs-logs" in [tags] {

  elasticsearch {
    hosts => ["els-ip1", "els-ip2", "els-ip3"]
    index => "nodejs-%{+YYYY.MM.dd}"
    manage_template => false
    user => "logstash_writer"
    password => ''
  }
 }
else if "iis-logs" in [tags] {

   elasticsearch {
	 hosts => ["els-ip1", "els-ip2", "els-ip3]
     index => "filebeat-%{+YYYY.MM.dd}"
     manage_template => false
     user => logstash_writer
     password => ''
   }
  }
else if "platform-logs" in [tags] {

   elasticsearch {
     hosts => ["els-ip1", "els-ip2", "els-ip3]
     index => "platform-%{+YYYY.MM.dd}"
     manage_template => false
     user => logstash_writer
     password => ''
   }
  }
else if "syslog-logs" in [tags] {

   elasticsearch {
     hosts => ["els-ip1", "els-ip2", "els-ip3]
     index => "syslog-%{+YYYY.MM.dd}"
     manage_template => false
     user => logstash_writer
     password => ''
   }
  }
else if "metricbeat" in [tags] {
   elasticsearch {
     hosts => ["els-ip1", "els-ip2", "els-ip3]
     index => "metricbeat-%{+YYYY.MM.dd}"
     manage_template => false
     user => logstash_writer
     password => ''
   }
  }
}
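
Side note: to make the drift easy to graph in Kibana, I've been thinking of adding a small ruby filter to materialise it as a number. This is an untested sketch, and pickup_lag_s is just a name I made up:

filter {
  if [beat_timestamp] {
    ruby {
      # both fields are Logstash timestamps; to_f converts them to epoch seconds
      code => "event.set('pickup_lag_s', event.get('beat_timestamp').to_f - event.get('@timestamp').to_f)"
    }
  }
}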

All machines are on a vmware hypervisor, with the following specs:

WindowsAppServer: 8 vCPUs, 8GB RAM (Windows Server 2016)
Logstash servers: 8 vCPUs, 16GB RAM, 8GB JVM heap (CentOS 7.5)
ES servers: 12 vCPUs, 64GB RAM, 32GB JVM heap (CentOS 7.5)

All machines connect to a SAN (Dell Compellent) over fibre with 10Gb NICs.

Using X-Pack with monitoring and security.

We currently have 155 indices, 846 million documents indexed, 719GB disk usage, 347 primary shards and 348 replica shards.

Indexes are created on a daily basis (filebeat-%{+YYYY.MM.dd}, platform-%{+YYYY.MM.dd}, etc.).

3 Logstash nodes, 3 ES nodes (with the Kibana server acting as a coordinating-only node to load-balance requests to the ES cluster; it is not a data, ingest or master node).

There are 43 active beats running Metricbeat and Filebeat, and we are using persistent queues in Logstash.
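
For reference, the persistent queue bits live in logstash.yml, roughly like this (the values here are illustrative, not our exact settings):

queue.type: persisted
queue.max_bytes: 4gb                  # cap on the disk space the queue may use
path.queue: /var/lib/logstash/queue   # defaults to path.data/queue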

Phew, I think I've given a good overview.

Now to the problem: the consultant Elastic sent to help us set things up suggested we add a beat_timestamp field to record when Filebeat actually picked up a log entry, while @timestamp holds the log entry's own time, so we could see if there was any 'drift' between a log being written to disk on the Windows server and it being picked up by Filebeat.

Today, I've noticed that this time 'drift' can be as much as 45 minutes, but I'm not sure why. I believe we have enough 'grunt', but I'm unsure if there are some Filebeat settings I can change to help fix this.

Our indexes are set to 1 primary shard and 1 replica shard, as per the consultant's suggestion.

Example of the issue (I've had to obfuscate some fields, but it should hopefully be enough to show the problem):

Let me know if you need any more information, but hopefully I've covered everything. I'm seeing this drift across the board, but it is bigger on the busiest Windows servers (the ones with more log entries written per second).

Maybe it's a timezone problem?

All machines run on UTC, plus the drift differs during the day.

Based on the example event you provided, it seems like Filebeat picks up the event ~20 minutes after it was written by the application. Do you have some kind of backpressure? How many events per second are written to the disk that Filebeat has to process?
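
If Logstash is pushing back, the knobs usually involved on the Filebeat side are the Logstash output and the internal memory queue. A sketch of the sort of thing to experiment with (the numbers are examples, not recommendations):

output.logstash:
  hosts: ["log1:5001", "log2:5001", "log3:5001"]
  loadbalance: true    # spread batches across all three hosts instead of picking one
  worker: 2            # network workers per configured host
  bulk_max_size: 4096  # events per batch sent to Logstash (default 2048)

queue.mem.events: 8192 # internal buffer between harvesters and the output (default 4096)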

Hi

OK, so during our peak times, on average:

we have 43 beats (Filebeat and Metricbeat) doing an average of 40 e/s each, with one beat averaging around 100 e/s, across the 3 Logstash servers.
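
(That works out to roughly 43 × 40 ≈ 1,700 e/s in aggregate, or around 575 e/s per Logstash node.)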

As a rough estimate, I'd say around 80 events per second are written to disk on each machine running Filebeat.

Thanks,

Michael
