Elastic JSON Processor Error: `cannot add non-map fields to root of document`

I am ingesting logs in ECS format from the Elastic Serverless Forwarder (ESF) into Elasticsearch. The logs themselves are generated by the ECS Python logging library. Because they arrive via ESF, I need to expand the NDJSON events with an ingest pipeline rather than having an agent do it.

However, when I ingest the logs, I get the following error message:

cannot add non-map fields to root of document

I do not receive the same error if I extract the JSON to a target_field; however, there is no easy way to magically move everything from the target field back to the root, as not all fields are present in every event and some fields need to be merged (e.g., log.*).
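
For reference, here is the target_field variant of the json processor that does not raise the error; the message_json target name is just illustrative:

{
  "json": {
    "description": "Expand JSON message payload into a nested field",
    "field": "message",
    "target_field": "message_json"
  }
}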

Sample Log Entry

POST _ingest/pipeline/expand-json-events/_simulate
{
  "docs": [
    {
      "_index": "index",
      "_id": "a9247583-8b75-4fa4-b8e7-053500f2736e",
      "_source": {
        "@timestamp": "2023-08-15T14:12:01.486Z",
        "message": "{\"@timestamp\":\"2023-08-15T14:12:01.486Z\",\"log.level\":\"debug\",\"message\":\"My log message\",\"ecs\":{\"version\":\"1.6.0\"},\"log\":{\"logger\":\"logger_name\",\"origin\":{\"file\":{\"line\":123,\"name\":\"file.py\"},\"function\":\"function_name\"},\"original\":\"My log message\"},\"process\":{\"name\":\"MainProcess\",\"pid\":8,\"thread\":{\"id\":140241849009984,\"name\":\"MainThread\"}}}"
      }
    }
  ]
}

Elasticsearch Ingest Pipeline

{
  "description": "Expand JSON events",
  "processors": [
    {
      "rename": {
        "description": "Save original event",
        "field": "message",
        "target_field": "event.original"
      }
    },
    {
      "json": {
        "description": "Expand JSON message payload",
        "field": "event.original",
        "add_to_root": true,
        "add_to_root_conflict_strategy": "replace",
        "allow_duplicate_keys": false,
        "strict_json_parsing": true
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "field": "error.message",
        "value": "{{ _ingest.on_failure_message }}"
      }
    },
    {
      "set": {
        "field": "error.type",
        "value": "{{ _ingest.pipeline }}"
      }
    },
    {
      "set": {
        "field": "event.kind",
        "value": "pipeline_error"
      }
    }
  ]
}

@DougR

Interesting: it works without the rename.

PUT _ingest/pipeline/expand-json-events
{
  "description": "Expand JSON events",
  "processors": [
    {
      "json": {
        "description": "Expand JSON message payload",
        "field": "message",
        "allow_duplicate_keys": true,
        "add_to_root": true, 
        "strict_json_parsing": true,
        "add_to_root_conflict_strategy" : "replace"
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "field": "error.message",
        "value": "{{ _ingest.on_failure_message }}"
      }
    },
    {
      "set": {
        "field": "error.type",
        "value": "{{ _ingest.pipeline }}"
      }
    },
    {
      "set": {
        "field": "event.kind",
        "value": "pipeline_error"
      }
    }
  ]
}

Also, you have an "issue" lurking: log is an object, yet you also have a "dotted" field, log.level (see the dot_expander sketch below the simulate output).

Here is the _simulate of the above...

{
  "docs": [
    {
      "doc": {
        "_index": "index",
        "_id": "a9247583-8b75-4fa4-b8e7-053500f2736e",
        "_version": "-3",
        "_source": {
          "log.level": "debug",
          "process": {
            "name": "MainProcess",
            "pid": 8,
            "thread": {
              "name": "MainThread",
              "id": 140241849009984
            }
          },
          "@timestamp": "2023-08-15T14:12:01.486Z",
          "message": "My log message",
          "ecs": {
            "version": "1.6.0"
          },
          "log": {
            "original": "My log message",
            "logger": "logger_name",
            "origin": {
              "file": {
                "line": 123,
                "name": "file.py"
              },
              "function": "function_name"
            }
          }
        },
        "_ingest": {
          "timestamp": "2023-08-16T05:22:15.014074863Z"
        }
      }
    }
  ]
}
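
If you want to fold that dotted log.level key back into the log object, the dot_expander processor is built for exactly that kind of merge; a minimal sketch, untested against this pipeline:

{
  "dot_expander": {
    "description": "Merge the dotted log.level key into the log object",
    "field": "log.level"
  }
}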

No, I am not sure why the rename is causing issues... at this time.

I never thought to check without the rename. Interesting. I'll try that. I wonder if it's trying to insert something into the event field and is choking on that. I do the rename out of habit, but with ECS I guess it's not necessary.

It looks like the problem is that the json processor does not handle fields with dots in their names correctly. In the JsonProcessor code, the add_to_root path looks for a field literally named event.original, but the root map of fields only contains event (which points to an object containing original). I think we need to fix JsonProcessor.
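
If that is right, a minimal _simulate with an inline pipeline, assuming the behavior reproduces in isolation, should fail with the same error for any dotted field path:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "json": {
          "field": "event.original",
          "add_to_root": true
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "event": {
          "original": "{\"foo\":\"bar\"}"
        }
      }
    }
  ]
}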

And it looks like it might not be isolated to just the json processor: Support dotted field names in ingest processors · Issue #96648 · elastic/elasticsearch · GitHub.

Per this, I tried the following, which worked as expected with no errors (full pipeline sketched after the list):

  • Rename message to __event_original.
  • Extract the JSON payload from __event_original.
  • Rename __event_original to event.original.
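
A sketch of that workaround as one pipeline; the __event_original staging name comes from the steps above, and the error handling is omitted for brevity:

PUT _ingest/pipeline/expand-json-events
{
  "description": "Expand JSON events (dotted-field workaround)",
  "processors": [
    {
      "rename": {
        "description": "Stage the original event under an undotted name",
        "field": "message",
        "target_field": "__event_original"
      }
    },
    {
      "json": {
        "description": "Expand JSON message payload at the root",
        "field": "__event_original",
        "add_to_root": true,
        "add_to_root_conflict_strategy": "replace"
      }
    },
    {
      "rename": {
        "description": "Move the original event to its ECS home",
        "field": "__event_original",
        "target_field": "event.original"
      }
    }
  ]
}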

But again... since it's ECS, there's no real need to preserve the original event. Interesting, however, that the grok and date processors, at least, do not seem to have this issue.

Thx.
