Parsing nested JSON

I am having issues with our AWS ECS -> Fluent Bit -> Elasticsearch setup, specifically around nested JSON.

For example, if the log message is:

{
  "endpoint": "/process",
  "payload": {
     "body": {
       "success": "true",
       "items": [
          {"name": "item_one"},
          {"name": "item_two"}
       ]   
     }
  }
}

We would like the following fields to be parsed:

endpoint -> "/process"
payload -> {"body": {"success": "true", "items": [{"name": "item_one"}, {"name": "item_two"}]}}

Only the top-level keys.

We set "index.mapping.depth.limit": 1, but this resulted in the logs being rejected by Elasticsearch:

    "status":400,
    "error":{
        "type": "illegal_argument_exception",
        "reason": "Limit of mapping depth [1] has been exceeded due to object field [org]"
    }
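(For reference, that limit is an index-level setting and was applied along these lines; the index name below is just illustrative.)

PUT /app-logs/_settings
{
  "index.mapping.depth.limit": 1
}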

Is there a setting that will parse only the top level but accept the rest of the data as the body?

Hey,

I'll try my best but you might want to wait for better answers :smiley:

I do believe it's a mapping issue; Elastic is rejecting it here because the setting is not defined?

You could also try to set the fields as keyword, but that would be a static per-field definition, which can be tedious if you have mixed data with a lot of fields.

There are also a lot of Ruby scripts around here if you want to extract the subkeys.

I appreciate the help!

So you are saying I should try to get the parsing done correctly at the Fluent Bit level and not the Elasticsearch level?

No, you'll have to define a static type for the incoming field on the Elastic side so that the indexed field values are of type text (Field data types | Elasticsearch Guide [8.17] | Elastic).
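
Roughly something like this, as a minimal sketch (the index name app-logs is just an example). Here endpoint gets a static keyword type, and payload is mapped as a disabled object so Elasticsearch keeps it in _source without trying to map the nested fields. The flattened field type is another option if you still want to search inside payload.

PUT /app-logs
{
  "mappings": {
    "properties": {
      "endpoint": { "type": "keyword" },
      "payload":  { "type": "object", "enabled": false }
    }
  }
}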

Got it, thanks.

We were hoping not to have to maintain the mappings at the Elastic level, but perhaps we'll have to.

Would it be possible to parse it at the Fluent Bit level so that the second level is a string instead of an object? Just trying to work out if there is a best practice for us to follow.

In the end we sorted it out at the ingest pipeline level:

PUT /_ingest/pipeline/main_pipeline
{
  "description" : "process-pipeline",
  "processors" : [
    {
      "date" : {
        "field" : "timestamp",
        "formats" : ["ISO8601"],
        "ignore_failure" : true
      }
    },
    {
      "script": {
        "source": """
          // Stringify any top-level field whose value is an object (Map),
          // so Elasticsearch only has to map the top-level keys.
          for (entry in ctx.entrySet()) {
            if (entry.getValue() instanceof Map) {
              ctx[entry.getKey()] = entry.getValue().toString();
            }
          }
        """
      }
    }
  ]
}

Not sure if it's best practice, but it worked.
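
For anyone else reading: one way to have the pipeline run on every incoming document is to set it as the index's default pipeline. A sketch, again with an illustrative index name:

PUT /app-logs/_settings
{
  "index.default_pipeline": "main_pipeline"
}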


Yes! That's the script I was thinking about; you solved it from the parsing side.

I suggest you also check the data types in your mapping (which I believe is automatic), so that you understand the types must match; conflicting types can sometimes prevent a document from being ingested correctly by Elastic.
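
You can see what dynamic mapping has produced with something like this (index name is illustrative):

GET /app-logs/_mapping

With the script above, payload should now arrive as a string, so dynamic mapping will typically map it as text with a keyword subfield.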