Further improvements for log_ecs_reformatting

APM Agent language and version: Java 1.24.0

Browser version: Chrome 91

Hello all,

I am still testing the Java APM agent with log_ecs_reformatting enabled and found the following problems:

  1. The APM agent generates invalid JSON objects (for Elasticsearch) when an ECS field that is also generated by the agent is added to the MDC
  • use the following code to generate a log
MDC.put("event.dataset", "security");
log.info("test");
  • this is the JSON message
{
    "@timestamp": "2021-06-23T08:53:45.345Z",
    "log.level": "INFO",
    "message": "test",
    "ecs.version": "1.2.0",
    "service.name": "tomcat-application",
    "event.dataset": "tomcat-application.[redacted]",
    "process.thread.name": "http-nio-2000-exec-8",
    "log.logger": "[redacted]",
    "transaction.id": "d05993a768800728",
    "trace.id": "4b50419e4c42971617afc8cdfbe179e2",
    "event.dataset": "security"
}
  • this is the ingestion response
{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "failed to parse"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "failed to parse",
    "caused_by" : {
      "type" : "json_parse_exception",
      "reason" : "Duplicate field 'event.dataset'\n at [Source: (byte[])...[truncated 163 bytes]; line: 11, column: 20]"
    }
  },
  "status" : 400
}
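The rejection can be reproduced outside Elasticsearch: the `json_parse_exception` in the response comes from Jackson's duplicate-field detection. A minimal sketch, assuming `jackson-databind` is on the classpath:

```java
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.ObjectMapper;

public class DuplicateKeyDemo {
    public static void main(String[] args) {
        // Two "event.dataset" keys, as produced when the MDC value
        // collides with the agent-generated field
        String json = "{\"event.dataset\":\"tomcat-application.x\",\"event.dataset\":\"security\"}";

        ObjectMapper mapper = new ObjectMapper();
        // Mimics Elasticsearch's behavior of rejecting duplicate keys
        mapper.enable(JsonParser.Feature.STRICT_DUPLICATE_DETECTION);
        try {
            mapper.readTree(json);
            System.out.println("parsed");
        } catch (Exception e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```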
  • proposal: do not store the MDC fields in the root; instead, create an mdc subelement like this
{
    "@timestamp": "2021-06-23T08:53:45.345Z",
    "log.level": "INFO",
    "message": "test",
    "ecs.version": "1.2.0",
    "service.name": "tomcat-application",
    "event.dataset": "tomcat-application.[redacted]",
    "process.thread.name": "http-nio-2000-exec-8",
    "log.logger": "[redacted]",
    "transaction.id": "d05993a768800728",
    "trace.id": "4b50419e4c42971617afc8cdfbe179e2",
   "mdc": {
        "event.dataset": "security"
   }
}
  2. Although co.elastic.logging.logback.EcsEncoder supports storing markers, the APM agent never stores them, because co.elastic.apm.agent.logback.LogbackEcsReformattingHelper always calls ecsEncoder.setIncludeMarkers(false); in createEcsFormatter
  • use the following code to create a log with a marker
log.info(MarkerFactory.getMarker("CONFIDENTIAL"), "test");
  • use the following code to create a log with multiple markers
Marker dupMarker = MarkerFactory.getDetachedMarker("COMPOSITE_MARKER");
dupMarker.add(MarkerFactory.getMarker("SECURITY"));
dupMarker.add(MarkerFactory.getMarker("CONFIDENTIAL"));
log.info(dupMarker, "test");
  • the markers should be shown in the resulting JSON document
  • proposal: either always store the markers or add a setting with which this feature can be disabled (I would expect it to be enabled by default)
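For reference, the standalone encoder already exposes the switch the agent hard-codes. A sketch of wiring it up programmatically, assuming logback-classic and the co.elastic.logging:logback-ecs-encoder dependency (ecs-logging-java 1.x) are present:

```java
import ch.qos.logback.classic.Logger;
import ch.qos.logback.classic.LoggerContext;
import ch.qos.logback.classic.spi.ILoggingEvent;
import ch.qos.logback.core.ConsoleAppender;
import co.elastic.logging.logback.EcsEncoder;
import org.slf4j.LoggerFactory;
import org.slf4j.MarkerFactory;

public class MarkerDemo {
    public static void main(String[] args) {
        LoggerContext ctx = (LoggerContext) LoggerFactory.getILoggerFactory();

        EcsEncoder encoder = new EcsEncoder();
        encoder.setServiceName("tomcat-application");
        // The setter that LogbackEcsReformattingHelper currently forces to false
        encoder.setIncludeMarkers(true);
        encoder.setContext(ctx);
        encoder.start();

        ConsoleAppender<ILoggingEvent> appender = new ConsoleAppender<>();
        appender.setContext(ctx);
        appender.setEncoder(encoder);
        appender.start();

        Logger log = ctx.getLogger("demo");
        log.addAppender(appender);
        // Markers are serialized into the "tags" array of the ECS document
        log.info(MarkerFactory.getMarker("CONFIDENTIAL"), "test");
    }
}
```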

Best regards
Wolfram

How are you sending and parsing your logs?

We recommend using Filebeat's expand_keys option that turns flat values, such as "event.dataset" into a nested object. See also Get started | ECS Logging Java Reference [1.x] | Elastic.
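For reference, the relevant Filebeat input options might look like this (a sketch; the path is a placeholder, option names per Filebeat 7.x):

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.ecs.json   # placeholder path
    json.keys_under_root: true    # place parsed keys at the document root
    json.overwrite_keys: true     # let values from the log win over Filebeat defaults
    json.expand_keys: true        # "event.dataset" -> { "event": { "dataset": ... } }
```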

However, I think expand_keys suffers from the same issue and fails on conflicts, as opposed to giving precedence to the user-supplied value. There's an open issue about that: [Filebeat] json.expand_keys conflict behaviour should be configurable · Issue #23135 · elastic/beats · GitHub

We're currently thinking about how to enable similar things with ingest pipelines.

While this helps to avoid conflicts, it wouldn't be possible to set top-level fields via MDC, which I think is important.

That sounds like a better default, agree. Are you fine with how markers are serialized as tags, such as tags: ["SECURITY", "CONFIDENTIAL"]?

Hello Felix,

We recommend using Filebeat's expand_keys option that turns flat values

Currently, I am ingesting it "as is" with Filebeat into our 7.12.1 test cluster. I will check what Filebeat does with expand_keys and report back.

While this helps to avoid conflicts, it wouldn't be possible to set top-level fields via MDC which I think is important.

I am a bit conflicted about this: On one hand, what you say makes sense, as some users might want to have those fields in the root (which may still be possible with an ingest pipeline). On the other hand, "giving precedence to the user-supplied value" sounds like dropping information to me. Currently, I get an error showing that two keys with the same name exist; afterwards, I would lose either the original value or the user-supplied value.
Of course, the issue you mentioned talks about the possibility of renaming the existing key if another key with the same name is found, but this makes processing harder:

  • if both the agent and the application store a value under event.dataset: event.dataset contains the application-supplied value while event.dataset.value contains the agent's value
  • if only the agent stores a value under event.dataset: event.dataset contains the agent's value
  • if only the application stores a value under event.dataset: event.dataset contains the application-supplied value

This way, it is hard to recognise which value came from which source. Maybe my solution is not good, but I tried to solve exactly this problem:

  • if both the agent and the application store a value under event.dataset: event.dataset contains the agent's value while mdc.event.dataset contains the application-supplied value
  • if only the agent stores a value under event.dataset: event.dataset contains the agent's value
  • if only the application stores a value under event.dataset: mdc.event.dataset contains the application-supplied value

This way, the application and agent supplied values are always separated.

edit: A use case here is that our event.dataset contains a categorization of the log content (e.g. security), while the agent's event.dataset contains the log appender, which we need for routing to the correct index.

Are you fine with how markers are serialized as tags

Yes, this is fine by me.

I think there should be just one event.dataset, not two slightly differently named event.dataset/mdc.event.dataset fields. There's even a third layer, which is Filebeat: depending on version and configuration, it also adds event.dataset: "generic" by default.

Given the three layers of input (Filebeat, ECS logging default value (may be set by APM Agent), MDC), a single value should be negotiated. This means different levels of precedence/priority. The closer the value is configured to the event itself, the higher the precedence.

If you don't want the MDC values to override the defaults set in the ECS logging configuration, I'd argue you should not set MDC.put("event.dataset", "security");. Instead, use another key for the category, such as event.category. Note that this field restricts the allowed values.

What's arguably a bug in ecs-logging-java is that setting a custom value for event.dataset neither removes the default event.dataset nor disallows setting the field as an MDC value.

I just tested the expand_keys and it works. :+1:

Yes, maybe I am overcomplicating this. It is just that MDC.put("event.dataset", "security"); is currently read from the original logback appender, which is consumed by pipelines that I hoped could stay unaffected by our testing (even if it is "only" our test system). As log_ecs_reformatting is still marked as experimental, I hoped I could use the existing pipelines as they are while testing log_ecs_reformatting in parallel. It seems I have to overcome my hesitation...

I've created a PR in ES to allow for duplicate keys: Allow duplicate keys in json processor by felixbarny · Pull Request #74602 · elastic/elasticsearch · GitHub

@Wolfram_Haussig I created a PR with some enhancements to the log_ecs_reformatting feature, including changing the default setting for includeMarkers to true. You are welcome to try it out.


@Eyal_Koren I can confirm that service.node.name, service.environment (using log_ecs_reformatting_additional_fields), and the markers work:

{"@timestamp":"2021-07-20T04:25:34.754Z", "log.level": "INFO", "message":"Observer started", "ecs.version": "1.2.0","tags":["IMPORTANT"],"service.name":"tomcat-application","service.node.name":"TOMCAT-PDL-Q","event.dataset":"tomcat-application.OtcAppenderDev","process.thread.name":"observer","log.logger":"com.siemens.make.mfp.otcweb.runner.impl.ObserverImpl","service.environment":"quality"}