How to parse a multiline JSON file with Logstash


I have a JSON file in the following multiline format:

    {
        "status": "fail",
        "executiontime": 1117,
        "errormsg": "dummy error1",
        "testname": "test1",
        "errorcode": 0,
        "signalcode": 0
    }
    {
        "status": "pass",
        "executiontime": 1111,
        "errormsg": "Dummy error2",
        "testname": "test2",
        "errorcode": 0,
        "signalcode": 0
    }
    {
        "status": "fail",
        "executiontime": 1155,
        "errormsg": "Dummy error3",
        "testname": "test3",
        "errorcode": 0,
        "signalcode": 0
    }

I am using a grok pattern to extract the fields and index them into Elasticsearch.
My conf file looks something like this:

# An input plugin enables a specific source of events to be read by Logstash.
input {
    file {
        codec => multiline {
            pattern => "^\s\s\s\s}"
            negate => true
            what => previous
            max_lines => 20000
        }
        path => ["path/to//abc.json"]
        start_position => "beginning"
        sincedb_path => "/dev/null"
        type => "test"
        ignore_older => 0
    }
}

filter {
    if [type] == "test" {
        grok {
            match => [
                'message' , '%{GREEDYDATA}"status": "%{GREEDYDATA:status}", \r\n\s+"executiontime": %{GREEDYDATA:exectime}, \r\n\s+"errormsg": "%{GREEDYDATA:error}", \r\n\s+"testname": "%{GREEDYDATA:testname}", \r\n\s+"errorcode": %{GREEDYDATA:errorcode}, \r\n\s+"signalcode": %{GREEDYDATA:signalcode}\r%{GREEDYDATA}'
            ]
        }
        # Drop events that failed to parse.
        if "_jsonparsefailure" in [tags] {
            drop {}
        }
        if "_grokparsefailure" in [tags] {
            drop {}
        }
        mutate {
            gsub => ["message", "\r\n", ""]
            remove_field => ["message", "@version", "path", "host", "tags"]
        }
        # Convert the numeric fields from strings to integers.
        ruby {
            code => "
                event['exectime'] = event['exectime'].to_i;
                event['signalcode'] = event['signalcode'].to_i;
                event['errorcode'] = event['errorcode'].to_i;
            "
        }
    }
}

output {
    if [type] == "test" {
        stdout {
            codec => rubydebug
        }
    }
}



This works fine with the above pattern.
But the fields in the JSON may not be generated in the same order every time.
For example, "errorcode" and "signalcode" can appear at the top, and "testname" can appear in third place, as below:

"errorcode": 0,
"signalcode": 0,
"testname": "test1",
"status": "pass",
"executiontime": 1111,
"errormsg": "StaleElementReferenceException"

In this case the grok pattern in my config file above will not work.
Is there any way to handle this situation?

Looking for help ASAP.
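
For the reordered fields described above, one option is to stop matching the whole record with a single pattern and give grok one small pattern per key. The sketch below is only an illustration: it assumes the multiline codec has already joined each record into a single message, and that the string values contain no escaped double quotes. The captured field names are the ones used in the config above. With break_on_match set to false, grok tries every pattern in the list rather than stopping at the first hit, and because grok patterns are not anchored, each key is found wherever it appears in the record.

```
filter {
    grok {
        # Try every pattern instead of stopping at the first match.
        break_on_match => false
        match => {
            "message" => [
                '"status": "%{DATA:status}"',
                '"executiontime": %{NUMBER:exectime:int}',
                '"errormsg": "%{DATA:error}"',
                '"testname": "%{DATA:testname}"',
                '"errorcode": %{NUMBER:errorcode:int}',
                '"signalcode": %{NUMBER:signalcode:int}'
            ]
        }
    }
}
```

The :int suffix asks grok to store the captured number as an integer, which should make the to_i conversions in the ruby filter unnecessary.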

Look at the json filter

Initially I tried the json filter, but it did not work for me with multiline JSON.

How about splitting the field into multiple documents?

Or at least use mutate's split, after which you can operate on each piece of the document.

What I am looking for is:

My result.json file has the content below:

    {
        "program id" : "1",
        "id" : "aaa",
        "status" : "PASSED",
        "PauseTime" : "0",
        "testname" : "test1",
        "last update" : "2016-09-16 20:11:56",
        "start" : "2016-09-16 14:06:08",
        "status id" : "2"
    }
    {
        "program id" : "2",
        "id" : "bbb",
        "status" : "PASSED",
        "PauseTime" : "0",
        "last update" : "2016-09-16 20:13:32",
        "start" : "2016-09-16 20:13:08",
        "status id" : "2",
        "testname" : "test2"
    }

If you look closely, the keys and values are not in the same order. In the first record, testname is the fifth field, but in the second, testname is the last field.

If I use a grok pattern as shown in the post above, it will not work, because the key/value pairs are not in fixed positions.

I want to index this data into a type in Elasticsearch as 2 separate documents, each containing the fields above. Something like:

type: testdata,
_id: 1,
_source: {"testname":"test1", "program id" : "1", "id" : "aaa","status" : "PASSED","last update" : "2016-09-16 20:11:56", "start" : "2016-09-16 14:06:08"}

_source: {"testname":"test2", "program id" : "2", "id" : "bbb","status" : "PASSED","last update" : "2016-09-16 20:13:32", "start" : "2016-09-16 20:13:08"}

How do I achieve this?

To solve the multiline problem you can use the multiline codec:

So what you need to do is list, in the pattern field, the characters that mark a line as a continuation of the previous one, here the quotation mark. The codec then joins those lines into a single event and starts a new event only when a line begins with a character that is not in the pattern.
From your data I assume you want to break only at the opening curly bracket {, so I would put the quotation mark and the closing curly bracket in the pattern field. I believe you need to place the multiline codec before the json filter.
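
As a rough sketch of the multiline codec plus json filter combination suggested above, the config below starts a new event at every line whose first non-blank character is an opening curly bracket and then hands the joined text to the json filter. It rests on several assumptions: each record is a complete JSON object, there are no commas or array brackets between records, the path is only a placeholder, and the auto_flush_interval option is available in the installed version of the multiline codec plugin.

```
input {
    file {
        path => ["/path/to/result.json"]    # placeholder path, adjust as needed
        start_position => "beginning"
        sincedb_path => "/dev/null"
        codec => multiline {
            # A line that does NOT start with "{" is appended to the previous event.
            pattern => "^\s*\{"
            negate => true
            what => "previous"
            # Emit the last pending record after 2 seconds without new lines.
            auto_flush_interval => 2
        }
    }
}

filter {
    # Parse the joined record; keys become event fields regardless of their order.
    json {
        source => "message"
    }
}
```

If the records are separated by commas or wrapped in an array, the json filter will tag the event with _jsonparsefailure, and the extra characters would need to be stripped first, for example with mutate's gsub.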

Why not use a grok filter to capture both the field name and its value, and then store the value in Elasticsearch under the field named by the string that precedes it? Something like:


if field1=="status" {
add field status with value1