New data stream backing index dropping logs

I currently have microservice logs going from Kubernetes -> Confluent Kafka -> Elastic via a sink connector (I have confirmed that they are reaching Kafka but not Elastic). On the Elastic side I have a corresponding data stream that gets a new backing index every month, based on the lifecycle policy I'm using, I believe. The data stream uses the built-in "logs" index template.
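
For context, the monthly rollover here is presumably driven by a max_age condition in the ILM policy's rollover action. The sketch below only illustrates that kind of configuration, with a placeholder policy name; it is not my actual policy:

PUT _ilm/policy/<my-logs-policy>
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "30d"
          }
        }
      }
    }
  }
}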

When the most recent two backing indices were created (August and September), I noticed that logs from several microservices are no longer showing up in Elastic, whereas a few whose logs were previously not showing up now are. I looked back at backing indices from more than two months ago and noticed that the logs that had been showing up and no longer are seemed to have multiple conflicting fields, while no logs in the most recent two backing indices have any. That may or may not be related.

The logs being processed do not have a uniform structure, and it would not be practical to make them uniform. Correct me if I'm wrong, but this looks like a case of the new backing index's mapping being created from whichever log format happened to be processed first, a format that only some of the logs conform to, so the non-conforming logs are being dropped.
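
Comparing the dynamically generated mappings of an old and a new backing index should confirm or rule this out. A sketch of the requests, where the names in angle brackets are placeholders:

# List the backing indices behind the data stream
GET _data_stream/<your-data-stream>

# Compare the mapping of an old backing index against the newest one
GET <old-backing-index>/_mapping
GET <new-backing-index>/_mapping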

After some research, it looks like an index-wide ignore_malformed setting might resolve the issue, even if it's not the ideal fix. I've tried updating this setting in the portal on both the built-in logs index template and its three constituent component templates individually ("logs-mappings", "data-streams-mappings", "logs-settings"), but I get:

composable template [logs] template after composition with component templates [logs-mappings, data-streams-mappings, logs-settings] is invalid

and

updating component template [logs-mappings] results in invalid composable template [logs] after templates are merged

respectively.
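
For clarity, the change I was attempting is roughly the following, expressed as a dev tools request rather than the portal UI, and showing only the setting being added, not a complete template body:

PUT _component_template/logs-mappings
{
  "template": {
    "settings": {
      "index.mapping.ignore_malformed": true
    }
  }
}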

With this context, my questions are:

  • Is my assessment of the issue here correct?
  • Is it possible to change index settings across a data stream and all of its backing indices, specifically ignore_malformed?
  • Is there a better way to handle non-homogeneous data formats on a data stream, one that prevents logs from being dropped?

If things are being rejected, then you should have logs that tell you what the issue is. Are you able to share them?

Ah, I sort of forgot I could check the connector dead letter queue. Based on that, it seems like my assumption was correct.
The messages for the microservices whose logs are not showing up all have error headers similar to this:

[
  {
    "key": "__connect.errors.topic",
    "stringValue": "application.logs"
  },
  {
    "key": "__connect.errors.partition",
    "stringValue": "0"
  },
  {
    "key": "__connect.errors.offset",
    "stringValue": "1867606991"
  },
  {
    "key": "__connect.errors.connector.name",
    "stringValue": "lcc-00zz6"
  },
  {
    "key": "__connect.errors.task.id",
    "stringValue": "0"
  },
  {
    "key": "__connect.errors.stage",
    "stringValue": "TASK_PUT"
  },
  {
    "key": "__connect.errors.class.name",
    "stringValue": "org.apache.kafka.connect.sink.SinkTask"
  },
  {
    "key": "__connect.errors.exception.class.name",
    "stringValue": "io.confluent.connect.elasticsearch.ElasticsearchClient$ReportingException"
  },
  {
    "key": "__connect.errors.exception.message",
    "stringValue": "Indexing failed: ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=object mapping for [kubernetes_labels_app] tried to parse field [kubernetes_labels_app] as object, but found a concrete value]]"
  },
  {
    "key": "__connect.errors.exception.stacktrace",
    "stringValue": "io.confluent.connect.elasticsearch.ElasticsearchClient$ReportingException: Indexing failed: ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=object mapping for [kubernetes_labels_app] tried to parse field [kubernetes_labels_app] as object, but found a concrete value]]\n"
  }
]

That makes sense, since the rejected logs have text for this field while the accepted ones have an object with two fields. I'm going to figure out why that is occurring, since I wouldn't expect the type to differ between logs, but I think my other original questions still stand.
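
To illustrate the difference, with made-up sub-field names and values (only the shapes match what I'm actually seeing):

Accepted - kubernetes_labels_app is an object, matching the current mapping:
{
  "kubernetes_labels_app": {
    "name": "orders-service",
    "version": "1.2.3"
  }
}

Rejected - the same field arrives as a concrete string value:
{
  "kubernetes_labels_app": "orders-service"
}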

This is an issue... a field can only have one type, either a concrete type such as text or an object, depending on the mapping; whatever does not match the mapping will be rejected...
In this case it looks like your mapping is an object, and it is rejecting the logs where the field is text...

You will most likely need to either:

  • Fix that upstream
  • Create an ingest pipeline to detect that case and put the text value into the object field (see the sketch below)
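
Something along these lines, for example... a sketch only, and the sub-field name "name" is an assumption, use whatever your object mapping actually contains:

PUT _ingest/pipeline/fix-kubernetes-labels-app
{
  "description": "Sketch: wrap kubernetes_labels_app in an object when it arrives as a plain string",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": "if (ctx.kubernetes_labels_app instanceof String) { def v = ctx.kubernetes_labels_app; ctx.kubernetes_labels_app = ['name': v]; }"
      }
    }
  ]
}

Then point the data stream at it, e.g. by setting index.default_pipeline in the index template so new backing indices pick it up...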

I understand that, and I can fix it upstream. I just wanted to check whether there is a better way to handle this if it comes up again, since I will continually need to deal with heterogeneous data. It sounds like one way is to create an ingest pipeline like you mentioned. Any other advice you would give?

No, not really.....

Except slowly drive your customers toward some form of a common schema.

Such is the life of heterogeneous log aggregation...
