Duplicate Entries for multiline log processed by Logstash

Hello,

I'm new to the ELK stack and would greatly appreciate feedback on a problem that has been driving me crazy for the last couple of days.
I have Filebeat shipping logs, including multiline entries, to Logstash. Filebeat is configured like this:

...
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/*.log
  multiline.pattern: '^\['
  multiline.negate: true
  multiline.match: after
  multiline.max_lines: 100
  multiline.timeout: 5
...

The contents of the log file are as follows:

[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-01, @envtype:0] 11111111111111111111111
[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-00, @envtype:0] Multine #1.1
Multiline #1.2
Multiline #1.3
[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-01, @envtype:0] 2222222222222222222222
[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-02, @envtype:0] Multine #2.1
Multiline #2.2
Multiline #2.3
[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-01, @envtype:0] 3333333333333333333333
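
To make the expected grouping concrete, here is a minimal Python sketch of what the multiline settings above ask for: with negate set to true and match set to after, any line that does not start with "[" is appended to the preceding line that does. This is only an illustration, not Filebeat's implementation; the function name group_multiline is hypothetical, and multiline.timeout is not modelled.

```python
import re

# The sample log lines from above, one list item per physical line.
SAMPLE = """\
[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-01, @envtype:0] 11111111111111111111111
[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-00, @envtype:0] Multine #1.1
Multiline #1.2
Multiline #1.3
[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-01, @envtype:0] 2222222222222222222222
[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-02, @envtype:0] Multine #2.1
Multiline #2.2
Multiline #2.3
[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-01, @envtype:0] 3333333333333333333333
""".splitlines()


def group_multiline(lines, pattern=r"^\[", max_lines=100):
    """Join continuation lines onto the preceding line that matches the pattern.

    Hypothetical helper: approximates negate=true / match=after grouping only.
    """
    start = re.compile(pattern)
    events = []
    current = []
    for line in lines:
        if start.search(line) or not current:
            # A line matching the pattern starts a new event.
            if current:
                events.append("\n".join(current))
            current = [line]
        elif len(current) < max_lines:
            # A non-matching line continues the current event.
            current.append(line)
        # Continuation lines beyond max_lines are dropped in this sketch.
    if current:
        events.append("\n".join(current))
    return events


if __name__ == "__main__":
    for event in group_multiline(SAMPLE):
        print(repr(event))
```

Under these assumptions the nine physical lines collapse into five events, two of which span three lines each.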

The Filebeat logs show the following:

2017/11/23 15:50:14.638786 client.go:214: DBG  Publish: {
  "@timestamp": "2017-11-23T15:50:09.776Z",
  "beat": {
    "hostname": "moby",
    "name": "moby",
    "version": "5.6.4"
  },
  "input_type": "log",
  "message": "[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-01, @envtype:0] 11111111111111111111111",
  "offset": 4004,
  "source": "/opt/loqr/logs/new1.log",
  "type": "log"
}
2017/11/23 15:50:14.639906 client.go:214: DBG  Publish: {
  "@timestamp": "2017-11-23T15:50:09.776Z",
  "beat": {
    "hostname": "moby",
    "name": "moby",
    "version": "5.6.4"
  },
  "input_type": "log",
  "message": "[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-00, @envtype:0] Multine #1.1\nMultiline #1.2\nMultiline #1.3",
  "offset": 4132,
  "source": "/opt/loqr/logs/new1.log",
  "type": "log"
}
2017/11/23 15:50:14.647042 client.go:214: DBG  Publish: {
  "@timestamp": "2017-11-23T15:50:09.776Z",
  "beat": {
    "hostname": "moby",
    "name": "moby",
    "version": "5.6.4"
  },
  "input_type": "log",
  "message": "[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-01, @envtype:0] 2222222222222222222222",
  "offset": 4240,
  "source": "/opt/loqr/logs/new1.log",
  "type": "log"
}
2017/11/23 15:50:14.650747 client.go:214: DBG  Publish: {
  "@timestamp": "2017-11-23T15:50:09.776Z",
  "beat": {
    "hostname": "moby",
    "name": "moby",
    "version": "5.6.4"
  },
  "input_type": "log",
  "message": "[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] [@service:dopo-02, @envtype:0] Multine #2.1\nMultiline #2.2\nMultiline #2.3",
  "offset": 4368,
  "source": "/opt/loqr/logs/new1.log",
  "type": "log"
}
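
The four published events above carry distinct offsets, which suggests that Filebeat itself emits each entry only once. A minimal Python sketch of that sanity check follows. It assumes the (source, offset) pair identifies an event, which may not hold if a file is truncated or rotated; the helper name find_duplicates is hypothetical.

```python
from collections import Counter

# Only the fields needed for the check, copied from the debug output above.
PUBLISHED = [
    {"source": "/opt/loqr/logs/new1.log", "offset": 4004},
    {"source": "/opt/loqr/logs/new1.log", "offset": 4132},
    {"source": "/opt/loqr/logs/new1.log", "offset": 4240},
    {"source": "/opt/loqr/logs/new1.log", "offset": 4368},
]


def find_duplicates(events):
    """Return {(source, offset): count} for every key seen more than once."""
    counts = Counter((e["source"], e["offset"]) for e in events)
    return {key: n for key, n in counts.items() if n > 1}


if __name__ == "__main__":
    # An empty dict means no (source, offset) pair was published twice.
    print(find_duplicates(PUBLISHED))
```

The same check could be run against the documents that end up in Elasticsearch to see at which stage the extra entries appear.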

My Logstash is configured as follows:

input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    patterns_dir => ["/etc/logstash/conf.d/patterns"]
    match => { "message" => "\[%{LOCAL_TIMESTAMP:lts}\] \[%{IPADDRESS:hostIP}\] \[%{CATEGORY:category}\] \[%{METAINFO:metainfo}\] %{MESSAGEMULTI:message}" }
    overwrite => [ "message" ]
  }
}

With the following patterns:

LOCAL_TIMESTAMP [^\]]+
IPADDRESS [^\]]+
CATEGORY [^\]]+
METAINFO [^\]]+
MESSAGE .*$
MESSAGEMULTI (.|\r|\n)*$
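
For anyone who wants to test the expression outside Logstash, this is roughly what the grok match expands to, written as a Python regular expression. It is an approximation under stated assumptions: grok uses the Oniguruma engine, whose anchors and named-group syntax differ slightly from Python's re module, and the names LINE_RE and parse are made up for this sketch.

```python
import re

# Each custom pattern is substituted inline; group names mirror the grok field names.
LINE_RE = re.compile(
    r"\[(?P<lts>[^\]]+)\] "            # LOCAL_TIMESTAMP
    r"\[(?P<hostIP>[^\]]+)\] "         # IPADDRESS
    r"\[(?P<category>[^\]]+)\] "       # CATEGORY
    r"\[(?P<metainfo>[^\]]+)\] "       # METAINFO
    r"(?P<message>(?:.|\r|\n)*)$"      # MESSAGEMULTI, spans embedded newlines
)


def parse(event_message):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = LINE_RE.search(event_message)
    return match.groupdict() if match else None


if __name__ == "__main__":
    multiline_event = (
        "[2017-11-21 15:46:18,1511279178] [192.168.1.1] [INFO] "
        "[@service:dopo-00, @envtype:0] Multine #1.1\nMultiline #1.2\nMultiline #1.3"
    )
    print(parse(multiline_event))
```

With the multiline event that Filebeat publishes, the message group captures all three lines, so the pattern itself does not appear to split the entry.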

Although this correctly parses and ships my multiline log entries, it also generates one duplicate entry for each line of a multiline entry, as I can see in Kibana.

Can someone please give me a hint on how to avoid this duplication?

Regards, Pedro Borges
