Filebeat can't see single line when timestamp is the same

Hi people!

I've been breaking my head over a problem I have with Filebeat. The application log that needs to be sent to Logstash contains two lines with the same timestamp. I've added an example below (IPs/ports removed):

Mar 22 12:41:03 localhost haproxy[17727]: Connect from ip:port to ip:port (fe_web-port/TCP)
Mar 22 12:41:03 localhost haproxy[17727]: Connect from ip:port to ip:port (fe_web-port/TCP)

When these lines are pushed to logstash -> elasticsearch they show up as a single event in the message field:

"source": "/var/log/haproxy/haproxy-traffic.log",
"message": [
  "Mar 22 16:38:04 localhost haproxy[17727]: Connect from ip:port to ip:port (fe_web-port/TCP)",
  "Connect from ip:port to ip:port (fe_web-port/TCP)"
]

These lines need to be kept as separate events, but I am stuck on what configuration Filebeat needs to do this. So my question is:

Can Filebeat be configured so that it creates separate events, even when the timestamps are the same?

Thanks in advance.

I have tried to reproduce this scenario and it's working just fine.

Can you share your Filebeat and Logstash configs and versions?

Sorry for the late response, below you can find the configs:

Filebeat:

- type: log
  paths:
    - /var/log/haproxy/haproxy-events.log
  tags: ["haproxy","events"]
  fields_under_root: true
  fields:
    es_index: haproxy
- type: log
  paths:
    - /var/log/haproxy/haproxy-traffic.log
  tags: ["haproxy","traffic"]
  fields_under_root: true
  fields:
    es_index: haproxy

Logstash config:

filter {
  if "haproxy" in [tags] {
    if "events" in [tags] {
      grok {
        match => { "message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{SYSLOGPROG}: %{GREEDYDATA:message}" }
      }
    }
    if "traffic" in [tags] {
      grok {
        match => { "message" => "%{HAPROXYHTTP}" }
        match => { "message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{SYSLOGPROG}: %{GREEDYDATA:message}" }
      }
  }
}

The HAProxy traffic log contains some lines that are the same as in the events log, so that's why there is a similar grok filter under the "traffic" tag.

Both Filebeat and Logstash are running on version 6.1.3.

Hi @twan,

With these settings we can't see anything that could cause the duplicates to be removed. Are you using a custom event `_id` or a fingerprint in Logstash?

Hi Adrian,

Thank you for your response. No, I am not using a fingerprint or custom event `_id`s in Logstash, but I quickly read your blog post and maybe that is something we need to consider implementing.

I'm gonna take a closer look at fingerprinting...

At the moment the duplicates are removed and joined into the same message field. So maybe with fingerprinting I can get the duplicates to show up as their own events in ES.
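For reference, a minimal sketch of fingerprint-based ID assignment in Logstash (the HMAC key and index name below are placeholders, not from this thread): hash the message and use the hash as the Elasticsearch document `_id`. Note that this makes truly identical lines collapse into one document, so it suits duplicate *removal*, not keeping repeats:

```conf
filter {
  # Hash the raw message; identical lines produce identical fingerprints.
  fingerprint {
    source => "message"
    target => "[@metadata][fingerprint]"
    method => "SHA256"
    key    => "changeme"   # placeholder HMAC key
  }
}

output {
  elasticsearch {
    hosts       => ["localhost:9200"]
    index       => "haproxy-%{+YYYY.MM.dd}"   # placeholder index name
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```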

Thanks!

I've found my problem. In one of the grok filters, I use a GREEDYDATA capture with the name "message"....

So it looked like the message was a duplicate, but it was really an extra piece (leftover) from the grok filter appended to the original message..... :sweat_smile:
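For anyone hitting the same thing: when a grok capture name matches a field that already exists on the event (here `message`), Logstash appends the capture to it, turning the field into an array. A sketch of two possible fixes (the alternative field name is illustrative):

```conf
filter {
  # Option 1: let grok replace the original field instead of appending.
  grok {
    match     => { "message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{SYSLOGPROG}: %{GREEDYDATA:message}" }
    overwrite => ["message"]
  }

  # Option 2: capture into a differently named field, e.g. syslog_message,
  # and leave the original message field untouched.
  # grok {
  #   match => { "message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{SYSLOGPROG}: %{GREEDYDATA:syslog_message}" }
  # }
}
```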

I feel a bit stupid not noticing haha. Thanks again for the response!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.