Filebeat merged 2 "normal" lines


(Ilia Matveikin) #1

Hello.

In general, our application writes one-line JSON log entries, but sometimes an entry is multiline JSON. To merge the multiline JSON, I edited the Filebeat config as follows:

- input_type: log
  paths:
   - /var/log/application/eai-service.json
  json.message_key: message
  json.keys_under_root: true
  json.add_error_key: true
  json.overwrite_keys: true
  multiline:
    pattern: '^\{'
    negate: true
    match: after
  fields:
    tag: mp_eai
  fields_under_root: true

Example log:

{"@timestamp":"2018-07-18T17:15:02.082+04:00","@version":"1","message":"127.0.0.1 - - [2018-07-18T17:15:02.082+04:00] \"GET /admin/tasks?page=0&size=20&sort=created,desc HTTP/1.0\" 200 21655","method":"GET","protocol":"HTTP/1.0","status_code":200,"requested_url":"GET /admin/tasks?page=0&size=20&sort=created,desc HTTP/1.0","requested_uri":"/admin/tasks","remote_host":"127.0.0.1","content_length":21655,"elapsed_time":282}
{"@timestamp":"2018-07-18T17:15:04.269+04:00","@version":"1","message":"127.0.0.1 - - [2018-07-18T17:15:04.269+04:00] \"GET /admin/tasks?page=0&size=20&sort=created,desc HTTP/1.0\" 200 21655","method":"GET","protocol":"HTTP/1.0","status_code":200,"requested_url":"GET /admin/tasks?page=0&size=20&sort=created,desc HTTP/1.0","requested_uri":"/admin/tasks","remote_host":"127.0.0.1","content_length":21655,"elapsed_time":92}

As a result, these two log entries were merged into one. In Elasticsearch it looks like this:

{
  "_index": "mp_eai-6.3.0-2018.29",
  "_type": "doc",
  "_id": "416HrWQBchTaZXe-Fi1s",
  "_version": 1,
  "_score": null,
  "_source": {
    "content_length": 21655,
    "tag": "mp_eai",
    "requested_uri": "/admin/tasks",
    "status_code": 200,
    "remote_host": "127.0.0.1",
    "protocol": "HTTP/1.0",
    "source": "/var/log/application/eai-service.json",
    "method": "GET",
    "host": {
      "name": "appserver"
    },
    "beat": {
      "name": "server1",
      "hostname": "server1",
      "version": "6.3.0"
    },
    "elapsed_time": 92,
    "@timestamp": "2018-07-18T13:15:04.269Z",
    "offset": 3330035,
    "tags": [
      "beats_input_codec_plain_applied"
    ],
    "message": "127.0.0.1 - - [2018-07-18T17:15:02.082+04:00] \"GET /admin/tasks?page=0&size=20&sort=created,desc HTTP/1.0\" 200 21655\n127.0.0.1 - - [2018-07-18T17:15:04.269+04:00] \"GET /admin/tasks?page=0&size=20&sort=created,desc HTTP/1.0\" 200 21655",
    "@version": "1",
    "requested_url": "GET /admin/tasks?page=0&size=20&sort=created,desc HTTP/1.0"
  },
  "fields": {
    "@timestamp": [
      "2018-07-18T13:15:04.269Z"
    ]
  },
  "sort": [
    1531919704269
  ]
}

Logstash output:

output {
  if [tag] == "mp_eai" {
    elasticsearch {
      hosts => ["elkserver1:9200", "elkserver2:9200", "elkserver3:9200"]
      user => logstash_internal
      password => testpassword
      sniffing => true
      manage_template => false
      index => "mp_eai-%{[@metadata][version]}-%{+xxxx.ww}"
    }
  }

  stdout {
    codec => rubydebug
  }
}

Could you please advise what's wrong?


(Andrew Cholakian) #2

I'm not sure why the json options aren't working, but using the decode_json_fields processor does work.

For example:

filebeat.inputs:
- type: tcp
  host: "localhost:7070"
  multiline:
    pattern: '^\{'
    negate: true
    match: after
  fields:
    tag: mp_eai
  processors:
    - decode_json_fields:
        fields: ["message"]

and running head multi.json | nc localhost 7070 where multi.json includes your sample data.


(Ilia Matveikin) #3

Hi Andrew, thanks for the reply.

Actually, I'm wondering why the message fields were merged.
By the way, I can't reproduce the issue now.

Could you please confirm that the config I've shown will merge multiline JSON, e.g.:

{
		"@timestamp":"2018-07-18T17:15:04.269+04:00",
		"@version":"1",
		"message":"127.0.0.1 - - [2018-07-18T17:15:04.269+04:00] \"GET /admin/tasks?page=0&size=20&sort=created,desc HTTP/1.0\" 200 21655","method":"GET","protocol":"HTTP/1.0",
		"status_code":200,
		"requested_url":"GET /admin/tasks?page=0&size=20&sort=created,desc HTTP/1.0",
		"requested_uri":"/admin/tasks",
		"remote_host":
		"127.0.0.1",
		"content_length":21655,
		"elapsed_time":92
}

(ruflin) #4

The reason the processor works and the json options do not is the order of execution. With the json options, each line is decoded first; multiline is applied after the json decoder and is part of the reader / harvester. All the processors are applied only after the data has been collected by the harvester, so they run after multiline.
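
To make the ordering concrete, here is a small Python sketch of the two pipelines. It is a simplified model, not Filebeat's actual code: the function names and the shortened sample lines are made up for illustration, and it assumes that with json.message_key set, the multiline pattern is tested against the decoded message value rather than the raw line. Under that assumption the message text ("127.0.0.1 - - ...") never matches '^\{', so with negate: true and match: after every message is appended to the previous one, which would explain the merged event above.

```python
import json
import re

PATTERN = re.compile(r"^\{")  # same pattern as in the Filebeat config


def group_multiline(lines, pattern=PATTERN):
    """Model of multiline with negate: true, match: after.

    A line that does NOT match the pattern is appended to the previous event.
    """
    events = []
    for line in lines:
        if pattern.search(line) or not events:
            events.append(line)           # line starts a new event
        else:
            events[-1] += "\n" + line     # continuation of the previous event
    return events


def json_then_multiline(raw_lines, message_key="message"):
    # Model of the json.* options: decode each line first,
    # then run multiline on the value of message_key only.
    messages = [json.loads(line)[message_key] for line in raw_lines]
    return group_multiline(messages)


def multiline_then_processor(raw_lines):
    # Model of multiline + decode_json_fields: group the raw text first,
    # then decode each complete event.
    return [json.loads(event) for event in group_multiline(raw_lines)]


# Two one-line entries (shortened, hypothetical versions of the sample log).
raw = [
    '{"message": "127.0.0.1 - - GET /admin/tasks 200", "elapsed_time": 282}',
    '{"message": "127.0.0.1 - - GET /admin/tasks 200", "elapsed_time": 92}',
]

# One pretty-printed entry spread over several lines.
pretty = [
    "{",
    '  "message": "127.0.0.1 - - GET /admin/tasks 200",',
    '  "elapsed_time": 92',
    "}",
]

merged = json_then_multiline(raw)            # both messages end up in one event
separate = multiline_then_processor(raw)     # two separate events
rebuilt = multiline_then_processor(pretty)   # one event rebuilt from four lines

print(len(merged), len(separate), len(rebuilt))
```

In this model, decoding first merges the two normal lines, while grouping first keeps one-line entries separate and still reassembles a pretty-printed entry before it is parsed.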

