How to handle JSON-formatted data in the input channel in combination with multiline?

Hi all,

I am currently trying to apply the multiline codec (with particular regard to Java stack traces) to the tcp input stream in logstash, on events (logs) that are delivered to logstash by the logspout-logstash tool (which forwards logs from Docker containers). The tricky part is that the input logspout delivers is already in JSON format and at the same time contains multiline logs that would have to be merged by multiline in logstash.

As I understand logstash, the normal way to handle this situation would be to process the input from logspout with the json codec first, so that the JSON input is translated into corresponding logstash fields, and then apply the multiline codec to the result in order to merge the multiple log lines that belong to one logical log event into a single logstash event.

However, as far as I can tell, it is not possible to apply two codecs on the tcp input stream of logstash, so I can only use either "codec => json" or "codec => multiline" in the input. And since multiline is no longer available as a filter, it has to be placed in the input stream if it is to be used at all, which means I cannot apply the json codec there.

If I apply the multiline codec in the input section and then the json filter in the filter section, the json filter is unable to process the merged events ("Error parsing json").
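To illustrate the failure outside of logstash (a minimal Python sketch, not logstash itself): the merged message is several complete JSON documents joined by newlines, which is not itself one valid JSON document, so a single parse of the whole blob fails:

```python
import json

# Two complete JSON documents joined by a newline, roughly what the
# multiline codec hands to the json filter after merging two logspout events.
merged = '{"message": "line one"}\n{"message": "line two"}'

parse_error = None
try:
    json.loads(merged)  # the whole blob is not one valid JSON document
except json.JSONDecodeError as err:
    parse_error = err   # "Extra data" after the first document

# Parsing each physical line separately works fine.
docs = [json.loads(line) for line in merged.splitlines()]
```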

This is an example of the input logstash gets from the logspout-logstash tool (a single log line):

2018-09-11T07:16:23.792Z 192.168.0.3 {"Environment":"int","Instance":"myApp-core-abcd","docker":{"name":"/myapp","id":"73237eb27d452c6db1e1eaaf07","image":"192.168.0.138:6000/myapp-int:latest","hostname":"7d45","labels":{"Environment":"int","Instance":"myApp-core-abcd","build-date":"20180311","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":" DEBUG [20180911 09:16:23] - 83787 SessionMgrBean.getConnectionProfileForClientType started","stream":"stdout","tags":[]}

(Apart from the timestamp and the IP address at the beginning of the event, the real content, i.e. the part that carries the actual log output of the application and varies between events, is the part introduced by the key "message":"[actual log content]".)

This is my current, small logstash config for debugging purposes:

input
{ 
  tcp 
  { 
    port => 5000 
    type => "backend"
    codec => multiline 
    {
      pattern => "^.*? %{LOGLEVEL} +\["
      negate => true
      what => "previous"
    } #codec
  } #tcp
} #input

And this is what the multiline codec produces for multiline log lines received from logspout that match the multiline pattern in the logstash config (rubydebug-formatted):

{
      "@version" => "1",
          "host" => "192.168.0.3",
       "message" => "{\"Environment\":\"int\",\"Instance\":\"myApp-core-abcd\",\"docker\":{\"name\":\"/myapp\",\"id\":\"cc3032\",\"image\":\"192.168.0.138:6000/myapp-int:latest\",\"hostname\":\"7d45\",\"labels\":{\"Environment\":\"int\",\"Instance\":\"myApp-core-abcd\",\"build-date\":\"20180311\",\"license\":\"GPLv2\",\"name\":\"CentOS Base Image\",\"vendor\":\"CentOS\"}},\"message\":\" INFO  [20180913 15:25:50] - 50511 Blah.getData: executing SELECT yadada blah blah and so on fictional example for the actual log contentXYZ\",\"stream\":\"stdout\",\"tags\":[]}\n{\"Environment\":\"int\",\"Instance\":\"myApp-core-abcd\",\"docker\":{\"name\":\"/myapp\",\"id\":\"cc3032\",\"image\":\" 192.168.0.138:6000/myapp-int:latest\",\"hostname\":\"7d45\",\"labels\":{\"Environment\":\"int\",\"Instance\":\"myApp-core-abcd\",\"build-date\":\"20180311\",\"license\":\"GPLv2\",\"name\":\"CentOS Base Image\",\"vendor\":\"CentOS\"}},\"message\":\"\\tFROM abcdexception abcdexception0\",\"stream\":\"stdout\",\"tags\":[]}\n{\"Environment\":\"int\",\"Instance\":\"myApp-core-abcd\",\"docker\":{\"name\":\"/myapp\",\"id\":\"cc3032\",\"image\":\" 192.168.0.138:6000/myapp-int:latest\",\"hostname\":\"7d45\",\"labels\":{\"Environment\":\"int\",\"Instance\":\"myApp-core-abcd\",\"build-date\":\"20180311\",\"license\":\"GPLv2\",\"name\":\"CentOS Base Image\",\"vendor\":\"CentOS\"}},\"message\":\"\\tWHERE abcdexception0.abcdexception_id = ?\",\"stream\":\"stdout\",\"tags\":[]}",
          "port" => 45754,
    "@timestamp" => 2018-09-13T13:25:50.569Z,
          "tags" => [
        [0] "multiline"
    ],
          "type" => "backend"
}

(And this is just a very small example of a multiline event. Normally the merged event is much bigger when a stack trace appears in the log.)

Assuming that I cannot change the format of the log input that logstash receives from logspout, how should I best handle that JSON-formatted input in order to get a clean JSON mapping in logstash AND do the multiline merging as well?

Thanks,
Kaspar

The question is very much about the multiline codec, but the single example input you give is single-line. Do you have any more examples?

If your input were all single-line like the one posted, my approach would be to (a) use the line codec, producing one event per line, (b) use a grok or dissect filter to extract the top-level information (date, IP, JSON payload), optionally setting the event's @timestamp with the date filter, and then (c) use the json filter to parse the JSON body:

input {
  tcp {
    # ...
    codec => line
  }
}
filter {
  dissect {
    mapping => {
      "message" => "%{date} %{source_ip} %{[@metadata][payload]}"
    }
  }
  date {
    match => ["date", "ISO8601"]
  }
  json {
    source => "[@metadata][payload]"
  }
}
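To show what each step contributes, here is the same pipeline sketched in plain Python (not a logstash run; the sample line is heavily shortened, and the variable names mirror the fields in the config above):

```python
import json

# Sample single-line input: "<ISO8601 date> <source ip> <json payload>"
line = ('2018-09-11T07:16:23.792Z 192.168.0.3 '
        '{"Environment":"int","message":" DEBUG [20180911 09:16:23] - 83787 '
        'SessionMgrBean.getConnectionProfileForClientType started"}')

# dissect step: split on the first two spaces into date, source_ip, payload
date, source_ip, payload = line.split(" ", 2)

# json step: parse the payload into structured fields
event = json.loads(payload)
event.update({"date": date, "source_ip": source_ip})
```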

If your input truly is multiline, I would do much the same, except that I would work to ensure that the multiline codec's pattern can identify either a new log message or a log message continuation as early as possible in the string, without any wildcards.
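For example (an untested sketch, not your final pattern; it assumes a new logical message can be recognized by a log level at the start of the embedded "message" field, so adjust it to your real format):

```
input {
  tcp {
    port => 5000
    type => "backend"
    codec => multiline {
      # A line whose embedded "message" field starts with a log level
      # begins a new event; anything else continues the previous one.
      pattern => '"message":" *%{LOGLEVEL} +\['
      negate => true
      what => "previous"
    }
  }
}
```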

[Note: Due to the body character limit I am splitting my reply up into 2 or 3 replies...]

[Part 1]
Hello Ry,
Thanks for your feedback!

The example output I have posted is the result of the multiline codec processing several input lines that logically belong together. Input-wise I get both kinds of logs: single lines that are also logically single lines, but also log messages that logically span several physical lines and therefore must be merged by the multiline codec (e.g. stack traces).

For this reason, as I understand it, I have to use the multiline codec, and since the only place where it can be used is the input stream, that's where I have to place it (instead of the line codec).

Here's another example that illustrates better what logstash receives on the input stream and what it makes of it with the various codecs. The example spans several slightly different logstash configs, but the input logstash receives is always the same (except for the timestamps), i.e. the log input is based on exactly the same activity in the application, only the point in time differs.

Example 1: no processing in the input stream, output only applies "codec => line":
logstash.conf:

input { tcp { port => 5000 type => "backend" } } output { stdout { codec => line } }

Logstash output = 5 single lines that logically belong together (one logical log message, which can be seen from the content of the "message":[...] part):

2018-09-17T10:49:59.775Z 192.168.0.3 {"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":" INFO  [20180917 12:49:59] - 99772 BlaTestMo.getData: executing SELECT profile.id, profile.name, profile.ctype, profile.ison, profile.isdef, profile.ts","stream":"stdout","tags":[]}

2018-09-17T10:49:59.775Z 192.168.0.3 {"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":"\tFROM profile profile0","stream":"stdout","tags":[]}

2018-09-17T10:49:59.775Z 192.168.0.3 {"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":"\tWHERE profile0.ctype_id = ?","stream":"stdout","tags":[]}

2018-09-17T10:49:59.775Z 192.168.0.3 {"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":"\t AND profile0.ison = ?","stream":"stdout","tags":[]}

2018-09-17T10:49:59.776Z 192.168.0.3 {"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":"\t AND profile0.id = ?","stream":"stdout","tags":[]}

Example 2: input stream only applies "codec => line", output only applies "codec => line":
logstash.conf:

input { tcp { port => 5000 type => "backend" codec => line } } output { stdout { codec => line } }

Logstash output = 5 single lines that logically belong together (actually the format of the output is the same as in the 1st example, no changes here):

2018-09-17T11:43:39.593Z 192.168.0.3 {"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":" INFO  [20180917 13:43:39] - 19591 BlaTestMo.getData: executing SELECT profile.id, profile.name, profile.ctype, profile.ison, profile.isdef, profile.ts","stream":"stdout","tags":[]}

2018-09-17T11:43:39.593Z 192.168.0.3 {"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":"\tFROM profile profile0","stream":"stdout","tags":[]}

2018-09-17T11:43:39.593Z 192.168.0.3 {"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":"\tWHERE profile0.ctype_id = ?","stream":"stdout","tags":[]}

2018-09-17T11:43:39.593Z 192.168.0.3 {"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":"\t AND profile0.ison = ?","stream":"stdout","tags":[]}

2018-09-17T11:43:39.594Z 192.168.0.3 {"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":"\t AND profile0.id = ?","stream":"stdout","tags":[]}

[Part 2]
Example 3: Input uses multiline codec, output uses line codec; no other formatting done:
logstash.conf:

input 
{ 
  tcp 
  { 
    port => 5000 
    type => "backend" 

    codec => multiline
    {
      pattern => "^.*? %{LOGLEVEL} +\["
      negate => true
      what => "previous"
    }#multiline
  }#tcp
}#input

output { stdout { codec => line } }

Logstash output = 1 single event (shown by the fact that [timestamp] [source ip] appears only once) that includes the content of the 5 single lines of the previous examples, i.e. the multiline codec has done what it is supposed to do: it has merged the 5 physical lines, which logically are 1 log message, into 1 event:

2018-09-17T11:59:51.461Z 192.168.0.3 {"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":" INFO  [20180917 13:59:51] - 91392 BlaTestMo.getData: executing SELECT profile.id, profile.name, profile.ctype, profile.ison, profile.isdef, profile.ts","stream":"stdout","tags":[]}
{"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":"\tFROM profile profile0","stream":"stdout","tags":[]}
{"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":"\tWHERE profile0.ctype_id = ?","stream":"stdout","tags":[]}
{"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":"\t AND profile0.ison = ?","stream":"stdout","tags":[]}
{"Environment":"int","Instance":"Myapp-Core-abcd","docker":{"name":"/myapp","id":"c4088acde5e68431a7","image":"192.168.0.138:6000/myapp-abcd-int:latest","hostname":"c51e5eec41ad","labels":{"Environment":"int","Instance":"myapp-core-abcd","build-date":"20180517","license":"GPLv2","name":"CentOS Base Image","vendor":"CentOS"}},"message":"\t AND profile0.id = ?","stream":"stdout","tags":[]}

Example 4: Input uses line codec, output uses rubydebug codec; no other formatting done:
logstash.conf:

input { tcp { port => 5000 type => "backend" codec => line } } output { stdout { codec => rubydebug } }

Logstash output: 5 separate events:

{
      "@version" => "1",
          "port" => 56474,
       "message" => "{\"Environment\":\"int\",\"Instance\":\"Myapp-Core-abcd\",\"docker\":{\"name\":\"/myapp\",\"id\":\"a9fg6gh9dfe6\",\"image\":\"192.168.0.138:6000/myapp-abcd-int:latest\",\"hostname\":\"c51e5eec41ad\",\"labels\":{\"Environment\":\"int\",\"Instance\":\"myapp-core-abcd\",\"build-date\":\"20180517\",\"license\":\"GPLv2\",\"name\":\"CentOS Base Image\",\"vendor\":\"CentOS\"}},\"message\":\" INFO  [20180917 15:01:44] - 04884 BlaTestMo.getData: executing SELECT profile0.id, profile0.name, profile0.ctype, profile0.ison, profile0.isdef, profile0.ts\",\"stream\":\"stdout\",\"tags\":[]}",
          "type" => "backend",
    "@timestamp" => 2018-09-17T13:01:44.887Z,
          "host" => "192.168.0.3"
}
{
      "@version" => "1",
          "port" => 56474,
       "message" => "{\"Environment\":\"int\",\"Instance\":\"Myapp-Core-abcd\",\"docker\":{\"name\":\"/myapp\",\"id\":\"a9fg6gh9dfe6\",\"image\":\"192.168.0.138:6000/myapp-abcd-int:latest\",\"hostname\":\"c51e5eec41ad\",\"labels\":{\"Environment\":\"int\",\"Instance\":\"myapp-core-abcd\",\"build-date\":\"20180517\",\"license\":\"GPLv2\",\"name\":\"CentOS Base Image\",\"vendor\":\"CentOS\"}},\"message\":\"\\tFROM profile profile0\",\"stream\":\"stdout\",\"tags\":[]}",
          "type" => "backend",
    "@timestamp" => 2018-09-17T13:01:44.887Z,
          "host" => "192.168.0.3"
}
{
      "@version" => "1",
          "port" => 56474,
       "message" => "{\"Environment\":\"int\",\"Instance\":\"Myapp-Core-abcd\",\"docker\":{\"name\":\"/myapp\",\"id\":\"a9fg6gh9dfe6\",\"image\":\"192.168.0.138:6000/myapp-abcd-int:latest\",\"hostname\":\"c51e5eec41ad\",\"labels\":{\"Environment\":\"int\",\"Instance\":\"myapp-core-abcd\",\"build-date\":\"20180517\",\"license\":\"GPLv2\",\"name\":\"CentOS Base Image\",\"vendor\":\"CentOS\"}},\"message\":\"\\tWHERE profile0.ctype_id = ?\",\"stream\":\"stdout\",\"tags\":[]}",
          "type" => "backend",
    "@timestamp" => 2018-09-17T13:01:44.887Z,
          "host" => "192.168.0.3"
}
{
      "@version" => "1",
          "port" => 56474,
       "message" => "{\"Environment\":\"int\",\"Instance\":\"Myapp-Core-abcd\",\"docker\":{\"name\":\"/myapp\",\"id\":\"a9fg6gh9dfe6\",\"image\":\"192.168.0.138:6000/myapp-abcd-int:latest\",\"hostname\":\"c51e5eec41ad\",\"labels\":{\"Environment\":\"int\",\"Instance\":\"myapp-core-abcd\",\"build-date\":\"20180517\",\"license\":\"GPLv2\",\"name\":\"CentOS Base Image\",\"vendor\":\"CentOS\"}},\"message\":\"\\t AND profile0.ison = ?\",\"stream\":\"stdout\",\"tags\":[]}",
          "type" => "backend",
    "@timestamp" => 2018-09-17T13:01:44.887Z,
          "host" => "192.168.0.3"
}
{
      "@version" => "1",
          "port" => 56474,
       "message" => "{\"Environment\":\"int\",\"Instance\":\"Myapp-Core-abcd\",\"docker\":{\"name\":\"/myapp\",\"id\":\"a9fg6gh9dfe6\",\"image\":\"192.168.0.138:6000/myapp-abcd-int:latest\",\"hostname\":\"c51e5eec41ad\",\"labels\":{\"Environment\":\"int\",\"Instance\":\"myapp-core-abcd\",\"build-date\":\"20180517\",\"license\":\"GPLv2\",\"name\":\"CentOS Base Image\",\"vendor\":\"CentOS\"}},\"message\":\"\\t AND profile0.id = ?\",\"stream\":\"stdout\",\"tags\":[]}",
          "type" => "backend",
    "@timestamp" => 2018-09-17T13:01:44.887Z,
          "host" => "192.168.0.3"
}

[Part 3 of 3]
Example 5: Input uses multiline codec, output uses rubydebug codec; no other formatting done:
logstash.conf:

input 
{ 
  tcp 
  { 
    port => 5000 
    type => "backend" 

    codec => multiline
    {
      pattern => "^.*? %{LOGLEVEL} +\["
      negate => true
      what => "previous"
    }#multiline
  }#tcp
}#input

output { stdout { codec => rubydebug } }

Logstash output: 1 single event that contains the content of the 5 single lines of the previous examples:

{
    "@timestamp" => 2018-09-17T12:08:08.828Z,
          "port" => 46752,
          "host" => "192.168.0.3",
          "type" => "backend",
       "message" => "{\"Environment\":\"int\",\"Instance\":\"Myapp-Core-abcd\",\"docker\":{\"name\":\"/myapp\",\"id\":\"a9fg6gh9dfe6\",\"image\":\"192.168.0.138:6000/myapp-abcd-int:latest\",\"hostname\":\"c51e5eec41ad\",\"labels\":{\"Environment\":\"int\",\"Instance\":\"myapp-core-abcd\",\"build-date\":\"20180517\",\"license\":\"GPLv2\",\"name\":\"CentOS Base Image\",\"vendor\":\"CentOS\"}},\"message\":\" INFO  [20180917 14:08:08] - 88826 BlaTestMo.getData: executing SELECT profile0.id, profile0.name, profile0.ctype, profile0.ison, profile0.isdef, profile0.ts\",\"stream\":\"stdout\",\"tags\":[]}\n{\"Environment\":\"int\",\"Instance\":\"Myapp-Core-abcd\",\"docker\":{\"name\":\"/myapp\",\"id\":\"a9fg6gh9dfe6\",\"image\":\"192.168.0.138:6000/myapp-abcd-int:latest\",\"hostname\":\"c51e5eec41ad\",\"labels\":{\"Environment\":\"int\",\"Instance\":\"myapp-core-abcd\",\"build-date\":\"20180517\",\"license\":\"GPLv2\",\"name\":\"CentOS Base Image\",\"vendor\":\"CentOS\"}},\"message\":\"\\tFROM profile profile0\",\"stream\":\"stdout\",\"tags\":[]}\n{\"Environment\":\"int\",\"Instance\":\"Myapp-Core-abcd\",\"docker\":{\"name\":\"/myapp\",\"id\":\"a9fg6gh9dfe6\",\"image\":\"192.168.0.138:6000/myapp-abcd-int:latest\",\"hostname\":\"c51e5eec41ad\",\"labels\":{\"Environment\":\"int\",\"Instance\":\"myapp-core-abcd\",\"build-date\":\"20180517\",\"license\":\"GPLv2\",\"name\":\"CentOS Base Image\",\"vendor\":\"CentOS\"}},\"message\":\"\\tWHERE profile0.ctype_id = ?\",\"stream\":\"stdout\",\"tags\":[]}\n{\"Environment\":\"int\",\"Instance\":\"Myapp-Core-abcd\",\"docker\":{\"name\":\"/myapp\",\"id\":\"a9fg6gh9dfe6\",\"image\":\"192.168.0.138:6000/myapp-abcd-int:latest\",\"hostname\":\"c51e5eec41ad\",\"labels\":{\"Environment\":\"int\",\"Instance\":\"myapp-core-abcd\",\"build-date\":\"20180517\",\"license\":\"GPLv2\",\"name\":\"CentOS Base Image\",\"vendor\":\"CentOS\"}},\"message\":\"\\t AND profile0.ison = 
?\",\"stream\":\"stdout\",\"tags\":[]}\n{\"Environment\":\"int\",\"Instance\":\"Myapp-Core-abcd\",\"docker\":{\"name\":\"/myapp\",\"id\":\"a9fg6gh9dfe6\",\"image\":\"192.168.0.138:6000/myapp-abcd-int:latest\",\"hostname\":\"c51e5eec41ad\",\"labels\":{\"Environment\":\"int\",\"Instance\":\"myapp-core-abcd\",\"build-date\":\"20180517\",\"license\":\"GPLv2\",\"name\":\"CentOS Base Image\",\"vendor\":\"CentOS\"}},\"message\":\"\\t AND profile0.id = ?\",\"stream\":\"stdout\",\"tags\":[]}",
          "tags" => [
        [0] "multiline"
    ],
      "@version" => "1"
}

My challenge with this last example (unless there is a better way to achieve this without the multiline codec) is: the "message" => "{ ... }" field contains redundant metadata from the 5 original lines which I only need once. On top of that, the actual complete log message that the Kibana user is interested in is spread across the inner "message":"..." sub-parts within the big merged "message" field. I somehow need to extract that content into one separate field (e.g. "actual_log_msg" => "..."), remove the redundant content of the repeated sub-fields (e.g. "Instance":"myapp-core-abcd"), and map the remaining, non-redundant content to separate fields (e.g. "actual_instance" => "myapp-core-abcd").
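As a sketch of what I'm after (prototyped in plain Python rather than, say, a logstash ruby filter; the field name "actual_log_msg" is just my working name), the merged blob could be split back into its sub-documents, the shared metadata kept once, and the inner messages rejoined:

```python
import json

# Merged multiline blob: several complete JSON documents joined by "\n",
# as the multiline codec produces it (heavily shortened here).
merged = (
    '{"Instance":"myapp-core-abcd","message":" INFO  [20180917 14:08:08] - '
    '88826 BlaTestMo.getData: executing SELECT profile.id"}\n'
    '{"Instance":"myapp-core-abcd","message":"\\tFROM profile profile0"}\n'
    '{"Instance":"myapp-core-abcd","message":"\\tWHERE profile0.ctype_id = ?"}'
)

docs = [json.loads(line) for line in merged.split("\n")]

# The metadata is identical in every sub-document, so keep it once...
event = {k: v for k, v in docs[0].items() if k != "message"}

# ...and join the per-line "message" parts into one logical log message.
event["actual_log_msg"] = "\n".join(d["message"] for d in docs)
```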

I hope the above examples better illustrate the challenge I have with physical log lines in the input stream that logically must be merged into one log entry in elasticsearch/kibana. (The lines that logically are single-line messages and must be left as-is by the multiline codec are no problem so far: they match my multiline pattern and therefore each starts its own event.)

Thanks,
Kaspar

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.