Ingest Pipeline for otel logs

I've recently migrated our EDOT deployments to use the OpenTelemetry Operator, to bring them in line with the guidelines from Elastic.

Something that seems to be missing from this is parsing of structured logs (in this case NDJSON).

I added an ingest pipeline (logs@custom) that parses the message (or should that be body.text?); this is called by the default logs pipeline (logs@default-pipeline).
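For reference, here is a minimal sketch of what that logs@custom pipeline looks like (the pipeline name is the standard hook; the json processor options here are assumptions about my setup, not a recommendation):

```json
PUT _ingest/pipeline/logs@custom
{
  "description": "Parse NDJSON log lines into structured fields (sketch)",
  "processors": [
    {
      "json": {
        "field": "message",
        "add_to_root": true,
        "ignore_failure": true
      }
    }
  ]
}
```

With add_to_root the parsed keys land at the top level of the document; a target_field could be used instead to keep them namespaced.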

I could see that the messages were being parsed as expected, but I wasn't able to search on them in Kibana. For example, if there was a keyword field http_path and I used a KQL query of http_path : /api/*, I would get no results, even though I can see the field in the documents.

Am I doing anything incorrect here?

Hello Steve, were you able to find a resolution for this issue? Please guide us; we are facing it as well. The field type is showing as unknown, which is why it cannot be queried in KQL.

Hi,

I don’t recall having to do anything specific to address this issue.

I’ve been through the ingest pipelines in our deployment and can’t see that we are doing anything custom around JSON parsing (there are some pipelines that do specifically parse JSON messages, but they are not referenced in logs@default-pipeline).

I suspect that we were doing something wrong; I backed out all of our custom pipelines and then it started working.

So @Steve_Foster and @Muralikrishna_A

The problem with OTel-based logs and Elastic ingest pipelines is that the important JSON is stored with flattened/dotted field names under resource.attributes, and ingest pipeline processors do not work with "dotted" field names (as opposed to proper nested objects).

Example

  {
    "data_stream": {
      "dataset": "generic.otel",
      "namespace": "default",
      "type": "logs"
    },
    "observed_timestamp": "2025-08-29T02:59:44.356393284Z",
    "resource": {
      "attributes": {
        "cloud.account.id": "elastic-sa", <<< DOTTED FIELD NAMES 
        "cloud.instance.id": "7222568281050400394", 
        "cloud.platform": "gcp_kubernetes_engine",
        "cloud.provider": "gcp",
        "cloud.region": "us-west2",
        "deployment.environment": "production",
        "host.arch": "amd64",
        "host.cpu.cache.l2.size": 56320,
        "host.cpu.family": "6",
        "host.cpu.model.id": "79",
        "host.cpu.model.name": "Intel(R) Xeon(R) CPU @ 2.20GHz",
        "host.cpu.stepping": "0",
        "host.cpu.vendor.id": "GenuineIntel",
        "host.id": "7222568281050400394",
  .....
        "host.name": "gke-stephen-brown-gke-de-default-pool-432f31cc-9l5k",
        "k8s.cluster.name": "stephen-brown-gke-dev-cluster",
        "k8s.container.name": "cart",
        "k8s.container.restart_count": "0",
        "k8s.deployment.name": "cart",
        "k8s.namespace.name": "default",
        "k8s.node.name": "gke-stephen-brown-gke-de-default-pool-432f31cc-9l5k",
        "k8s.pod.ip": "10.116.0.5",
        "k8s.pod.name": "cart-5765f55cdc-574ls",
        "k8s.pod.start_time": "2025-08-27T12:22:38Z",
        "k8s.pod.uid": "08b38e9b-06c2-41b1-8fe0-3a1dfdae9cb4",
        "k8s.replicaset.name": "cart-5765f55cdc",
        "os.description": "Red Hat Enterprise Linux 9.6 (Plow) (Linux gke-stephen-brown-gke-de-default-pool-432f31cc-9l5k 6.6.97+ #1 SMP PREEMPT_DYNAMIC Sun Jul 27 08:50:12 UTC 2025 x86_64)",
        "os.type": "linux",
        "service.name": "cart"
      },
      "schema_url": "https://opentelemetry.io/schemas/1.6.1"
    }
  }

Dot expander processor

Expands a field with dots into an object field. This processor allows fields with dots in the name to be accessible by other processors in the pipeline. Otherwise these fields can’t be accessed by any processor.

So IF you want to do something with these fields, you are going to need to "expand" those dots first; then you can work with them. Perhaps at some point in the future Elastic ingest pipelines will work natively with this style of JSON, but today they do not.

So you will need to build something like this....


[
  {
    "set": {
      "field": "custom_pipeline",
      "value": "traces-otel@custom"
    }
  },
  {
    "set": {
      "field": "resource_exp",
      "copy_from": "resource"
    }
  },
  {
    "dot_expander": {
      "field": "*",
      "path": "resource_exp.attributes"
    }
  }
]
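For completeness, that processor list would be wrapped in a full pipeline definition, roughly like this (the pipeline name and description are my own placeholders, reusing the processors above):

```json
PUT _ingest/pipeline/traces-otel@custom
{
  "description": "Expand dotted resource.attributes so later processors can address them",
  "processors": [
    {
      "set": {
        "field": "resource_exp",
        "copy_from": "resource"
      }
    },
    {
      "dot_expander": {
        "field": "*",
        "path": "resource_exp.attributes"
      }
    }
  ]
}
```

After the dot_expander runs, later processors can address the values as real object paths, e.g. ctx.resource_exp.attributes.k8s.pod.name.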

Keep in mind that you are "moving" away from the OTel semantic conventions and OTel best practices; this kind of processing would normally be done in the Collector using OTTL or extensions, not at the sink... BUT that said, many things in OTel are still moving, so we shall see... :slight_smile:
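As a rough illustration of doing this in the Collector instead, a sketch using the contrib transform processor and OTTL (the processor name and the "looks like JSON" condition are my assumptions; adapt them to your own logs):

```yaml
processors:
  transform/parse_json_body:
    log_statements:
      - context: log
        statements:
          # Parse the JSON body into log attributes when the body looks like a JSON object
          - merge_maps(attributes, ParseJSON(body), "upsert") where IsMatch(body, "^\\{")
```

This keeps the parsing at the Collector, so the documents arrive in Elasticsearch already structured and no ingest pipeline gymnastics are needed.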


Hi All, we are using EDOT in our k8s environment, and for parsing the OTel logs we enabled a pipeline to parse the JSON logs; we used the pipeline below. body.text contains the actual log message of the application, so in the json processor we used the field “body.text” with target “msg”, and we got split fields like msg.status, msg.level, etc., but those field types come up as “unknown field”. Then we set the target field to “attributes” and got attributes.level, attributes.status (with the proper field types), but in addition we are also getting “level”, “status”, etc., so these are duplicate fields. Can anyone help us resolve this issue?

  {
    "json": {
      "ignore_failure": true,
      "field": "body.text",
      "target_field": "attributes",
      "if": "ctx.should_json == true"
    }
  }
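For what it's worth, the ctx.should_json condition implies an earlier processor sets that flag. One hypothetical way to do that (should_json comes from the snippet above; the condition itself is my assumption, and it presumes body is a real nested object, so if it arrives with a dotted field name you would need a dot_expander first, as discussed earlier in the thread):

```json
  {
    "set": {
      "field": "should_json",
      "value": true,
      "if": "ctx.body?.text instanceof String && ctx.body.text.startsWith('{')"
    }
  }
```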

The duplication of attributes.status and status is due to a mapping in the index template… I’ve not found it to be a problem beyond the visual duplication.

Parsing body.text should work pretty normally... And yes, you have to look at the template to see which fields have been aliased, as Elastic is currently navigating the transition between ECS (which was adopted by OTel) and the full OTel semantic conventions.