Facing an issue while parsing nested JSON

Hello Community, I have been facing a parsing issue for the past 24 hours.
I cannot work out whether Logstash allows parsing JSON and adding new fields without grok, kv, or flattening.

SCENARIO:
INPUT LOG:

{
  "accountname": "SuperUser",
  "Records": [
    {
      "requestParameters": {
        "sourceIPAddress": "10.12.18.191"
      },
      "eventTime": "2025-03-07T20:30:53.452Z"
    }
  ]
}

LOGSTASH CONFIG:

input {
  file {
    path => "/usr/share/logstash/config/sample.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  # Parse the incoming JSON
  json {
    source => "message"
    target => "logInfo"
  }

  # Create a copy of the 'accountname' field
  mutate {
    add_field => { "copy_of_accountname" => "%{[logInfo][accountname]}" }
  }
  mutate {
    remove_field => [ "logInfo", "message" ]
  }
}

output {
  stdout { codec => rubydebug }
}

OUTPUT:

{
             "@timestamp" => 2025-03-08T16:11:12.068109193Z,
                   "host" => {
        "name" => "fad5137b05ea"
    },
               "@version" => "1",
                   "tags" => [
        [0] "_jsonparsefailure"
    ],
                  "event" => {
        "original" => "}"
    },
                    "log" => {
        "file" => {
            "path" => "/usr/share/logstash/config/sample.log"
        }
    },
    "copy_of_accountname" => "%{[logInfo][accountname]}"
}

Ecosystem: Docker Logstash container
Version: 8.17.0

Expectation:
I should be able to extract the accountname field and append a copy of it (cn_accountname => accountname).

Please suggest tools (other than ChatGPT) that can help with parsing.

Thanks

Hi @madhavi_pdb, welcome to the community.

The problem is most likely that your log file is actually pretty-printed JSON, exactly as you showed it, rather than single-line NDJSON. Logstash is line oriented, so NDJSON is what it expects. Can you convert your log file to NDJSON?

You could test it pretty easily by passing your log through

cat sample.log | jq -c . > sample.ndjson

Then try your pipeline...
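
For the sample above, jq -c collapses each object onto a single line, so sample.ndjson should contain something like:

{"accountname":"SuperUser","Records":[{"requestParameters":{"sourceIPAddress":"10.12.18.191"},"eventTime":"2025-03-07T20:30:53.452Z"}]}

With one object per line, the file input can hand each line to your json filter unchanged.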

By default a file input consumes a file one line at a time, which makes sense for line-oriented logs. If you have multi-line JSON objects in the file then reformatting (as @stephenb suggested) is a good option. If that's not possible then you may be able to use a multiline codec on the file input to read each object as a single event. It depends how the objects are formatted.

For the file

 { "foo":
     { "bar": 1 }
 }
 { "a": 2 }
 { "b":
     { "c": 3 }
 }

we can see that every time a line starts with { it is the start of a new object. So we append lines to the object until we see another line starting with {. That can be done using

    codec => multiline {
        pattern => "^{"
        negate => true
        what => previous
        auto_flush_interval => 5
    }
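
For reference, a minimal sketch of that codec combined with the file input from the original post (same paths, nothing else changed) might look like:

input {
  file {
    path => "/usr/share/logstash/config/sample.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    # Join every line onto the previous event until a new line
    # starting with { begins the next object
    codec => multiline {
      pattern => "^{"
      negate => true
      what => "previous"
      auto_flush_interval => 5
    }
  }
}

The auto_flush_interval matters here: without it the last object in the file would sit in the codec waiting for a following { that never arrives.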

Sometimes we can roll up lines until we find the } at the start of a line that ends the object. Thus

{
     "foo": {
        "bar": 1
     }
}
{
    "a": 2
}
{ "b":
    { "c": 3 }
}

can also be read using

    codec => multiline {
        pattern => "^}"
        negate => true
        what => next
        auto_flush_interval => 5
    }

If you have a non-indented format like

{ "foo":
{ "bar": 1
}
}
{ "a": 2 }

then a multiline codec will not help you. You will need a program that can maintain state (a count of the { and } seen) in order to decide whether it has reached the end of an object. There is no Logstash codec or input that can do that.
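
If you do end up needing such a pre-processor, here is a hypothetical sketch in Python. Rather than counting { and } by hand (which breaks on braces inside strings), it leans on the standard library's incremental decoder to find where each object ends, then re-emits everything as NDJSON:

#!/usr/bin/env python3
# Hypothetical helper, not part of Logstash: converts a stream of
# concatenated, arbitrarily formatted JSON objects into NDJSON.
import json
import sys

def objects(text):
    decoder = json.JSONDecoder()
    idx = 0
    while idx < len(text):
        # Skip whitespace between objects
        while idx < len(text) and text[idx].isspace():
            idx += 1
        if idx >= len(text):
            break
        # raw_decode returns the parsed object and the index
        # just past its closing brace
        obj, idx = decoder.raw_decode(text, idx)
        yield obj

for obj in objects(sys.stdin.read()):
    print(json.dumps(obj, separators=(",", ":")))

Run it as python3 to_ndjson.py < sample.log > sample.ndjson and point the file input at the converted file. Note it reads the whole input into memory, so it is a sketch for modest files, not a streaming solution.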


Thanks, yes I got it @stephenb.

After I converted the file, it started working. But my streaming logs don't arrive formatted, so do we have to convert each log line properly in Logstash before parsing and adding fields?

How are similar big data problems solved?

Sure, got it. Thanks for the detailed explanation @Badger.

Normally they are solved at the source: pretty-printed JSON is bad for automation, so if the output is going to be consumed by machines, you just don't use it.

Depending on the tool this is pretty easy to do.
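
For example, assuming a hypothetical Python producer, emitting machine-friendly single-line JSON is usually just a matter of not asking for indentation:

import json

record = {"accountname": "SuperUser", "sourceIPAddress": "10.12.18.191"}

# Pretty-printed: readable for humans, but spans multiple lines
print(json.dumps(record, indent=2))

# Default: one compact object per line, which is what line-oriented
# consumers like the Logstash file input expect
print(json.dumps(record))

Most logging libraries and JSON encoders behave the same way: compact output is the default, and pretty-printing is an explicit option.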

Sure, got it. Thanks all for the responses. I was able to parse once I formatted the logs as proper single-line JSON.
