Parse multi-line JSON

Hi,

When I try to parse the multi-line JSON below, I see the events in ELK tagged with multiline, _jsonparsefailure, and _grokparsefailure.

From the JSON file content below, I want to remove KEY4 and host and send the rest of the fields to Elasticsearch.

[
{
"KEY1": "ABC",
"KEY2": "ABC",
"KEY3": "ABC",
"KEY4": "{"region":"11","UserSessionId":"222","UserId":"gllexie"}",
"host": "ABC",
"timestamp": 1595411041516,
},
{
"KEY1": "ABC2",
"KEY2": "ABC2",
"KEY3": "ABC2",
"KEY4": "{"region":"22","UserSessionId":"No%20CPM%20Profile","UserId":"gllexie"}",
"host": "ABC2",
"timestamp": 1595411041516,
}
]

Not sure what your current input/filter looks like, but it would be something like this.

    filter {
      mutate {
        remove_field => [ "host", "KEY4" ]
      }
    }

    filter {
      mutate {
        remove_field => ["host", "KEY4"]
      }
    }

This filter worked, but I see the JSON elements inside the message field, whereas I wanted them as separate fields in Kibana. Can you please help me resolve this?

Try this. If it doesn't work, can you post your full config?

    filter {
      json {
        source => "message"
        remove_field => [ "host", "KEY4" ]
      }
    }

    input {
      file {
        codec => multiline {
          negate => true
          what => previous
          pattern => "{*}"
        }
        path => ["C:/samplefile.json"]
        start_position => "beginning"
        sincedb_path => "C:/logs/dev"
      }
    }
    filter {
      mutate {
        remove_field => ["host", "KEY4"]
      }
    }
    output {
      elasticsearch {
        hosts => "localhost:9200"
        index => "sample-ingest"
      }
      stdout {
        codec => rubydebug
      }
    }

When I use the above config, the whole JSON is sent to Kibana inside the message field, whereas I want each JSON field as a separate property in Kibana.

If you do the below, does it work? Is there a reason you are using multiline?

    input {
      file {
        path => ["C:/samplefile.json"]
        start_position => "beginning"
        sincedb_path => "C:/logs/dev"
      }
    }

Hi, I am using multiline because my JSON object is spread across multiple lines.

Hi there,

First of all, please make use of the code formatter tool when pasting anything that isn't plain text (such as pipeline conf lines); otherwise it is more difficult to read and go through.

Now, to my understanding, you'd like to process that array of JSON objects of yours from a file and send each object as a separate document, with its fields extracted.

Now, the first thing I suggest you do is edit your source file (if possible) so that the whole JSON is on a single line, and be careful to leave an empty line at the end of the file.

Also, please note that the JSON you posted is not valid. In fact, pasting it into any JSON validator will highlight the spurious commas after the "timestamp" values and the quotes around the nested KEY4 value. To be valid, your JSON should look something like this:

[
  {
    "KEY1": "ABC",
    "KEY2": "ABC",
    "KEY3": "ABC",
    "KEY4": {
      "region": "11",
      "UserSessionId": "222",
      "UserId": "gllexie"
    },
    "host": "ABC",
    "timestamp": 1595411041516
  },
  {
    "KEY1": "ABC2",
    "KEY2": "ABC2",
    "KEY3": "ABC2",
    "KEY4": {
      "region": "22",
      "UserSessionId": "No%20CPM%20Profile",
      "UserId": "gllexie"
    },
    "host": "ABC2",
    "timestamp": 1595411041516
  }
]
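
For reference, the same sample content collapsed onto a single line (as suggested above) would make the file look something like this, followed by an empty line:

    [{"KEY1":"ABC","KEY2":"ABC","KEY3":"ABC","KEY4":{"region":"11","UserSessionId":"222","UserId":"gllexie"},"host":"ABC","timestamp":1595411041516},{"KEY1":"ABC2","KEY2":"ABC2","KEY3":"ABC2","KEY4":{"region":"22","UserSessionId":"No%20CPM%20Profile","UserId":"gllexie"},"host":"ABC2","timestamp":1595411041516}]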

Now, having said that, once you've managed to shrink your JSON onto one line (with a trailing empty line in the file), you could use a pipeline like the following:

input {
  file {
    path => "path/to/json/file"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => "json"
  }
}

filter {
  mutate {
    remove_field => ["host", "KEY4"]
  }
}

output {
  stdout {}
}

Having your source JSON file on one line spares you the multiline codec and all the hassles that come with it.

Obviously you can replace the standard output with your ES instance.
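
For instance, reusing the hosts and index from the config you posted earlier (adjust them to your own instance), the output section might become:

    output {
      elasticsearch {
        hosts => "localhost:9200"
        index => "sample-ingest"
      }
    }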

Hi Fabio,

Thank you for your response. I need to process KEY4 as a string: we have a limit on the length of the KEY4 value, so in some objects KEY4 is not valid JSON (it gets truncated once it reaches the length limit).

So what I am trying to do is either remove KEY4 or treat it as a string.

OK, but you need to pass it as valid JSON. That means that if you want to treat it as a string, you have to escape the quotes inside the KEY4 value, like so:

{
  "KEY1": "ABC",
  "KEY2": "ABC",
  "KEY3": "ABC",
  "KEY4": "{ \"region\":\"11\", \"UserSessionId\":\"222\", \"UserId\":\"gllexie\" }",
  "host": "ABC",
  "timestamp": 1595411041516
}

Now this is valid JSON and you can use that same approach.
Obviously you need to be able either to format the JSON file properly when writing it (escaping the quotes in the KEY4 value and putting the JSON on a single line) or to edit it before parsing it with Logstash.

Can you do that? Otherwise I'll have to provide you with a slightly less intuitive solution.

Hi Fabio,

Yes, I can format the KEY4 value as { \"region\":\"11\", \"UserSessionId\":\"222\", \"UserId\":\"gllexie\" },

but for some elements it can be { \"region\":\"11\", \"UserSessionId\":\"222\", \"UserId\":\"gllexieaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\",

OK, so you're telling me that some elements might have an excessively long value for that field, and that the string might not be closed properly with the trailing quote?

If that's the case, I have a feeling you need to first grok out the useless KEY4 part and then proceed with the JSON parsing; otherwise the JSON won't be valid.
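
Just to sketch the idea (a hypothetical example, assuming each event is on a single line and the "host" key always follows KEY4 in your data), you could strip the KEY4 part with a gsub before parsing:

    filter {
      mutate {
        # drop everything from "KEY4" up to the "host" key, whether or not
        # the KEY4 string value was closed properly (assumes "host" always
        # follows KEY4 on the same line)
        gsub => [ "message", '"KEY4":.*?"host"', '"host"' ]
      }
      json {
        source => "message"
        remove_field => [ "host" ]
      }
    }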

Also, in the case of an excessively long KEY4 value, will you lose the remaining part of the JSON, too (like host and timestamp)?

Can you please post here an example of such a case?

Thank you

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.