Add_kubernetes_metadata + rename = disappearing field?

Hi,

I have a very strange issue I can't pinpoint.

add_kubernetes_metadata works - I see the fields
rename works - I can rename fields

BUT, I can't rename the kubernetes fields. I added the `-d "*"` switch to filebeat to debug the issue, and this is the error:

2018-07-16T13:52:55.629Z DEBUG [rename] actions/rename.go:64 Failed to rename fields, revert to old event: could not fetch value for key: metadata.container, Error: key not found
2018-07-16T13:52:55.629Z DEBUG [filter] pipeline/processor.go:157 fail to apply processor client{add_docker_metadata=[match_fields=[] match_pids=[process.pid, process.ppid]], add_kubernetes_metadata, rename=[{From:kubernetes.pod.name To:metadata.container}]}: could not fetch value for key: metadata.container, Error: key not found

Here's the full example:

I add docker and kubernetes metadata, I see the fields both in filebeat's console and in Kibana:

processors:

    - add_docker_metadata: ~

    - add_kubernetes_metadata:
        in_cluster: true

In particular, I see the kubernetes.pod.name field. Now I just add rename:

processors:

    - add_docker_metadata: ~

    - add_kubernetes_metadata:
        in_cluster: true

    - rename:
        fields:
          - from: "kubernetes.pod.name"
            to: "metadata.container"

And it doesn't work. The errors, as written above:

2018-07-16T14:11:54.026Z DEBUG [rename] actions/rename.go:64 Failed to rename fields, revert to old event: could not fetch value for key: metadata.container, Error: key not found
2018-07-16T14:11:54.026Z DEBUG [filter] pipeline/processor.go:157 fail to apply processor client{add_docker_metadata=[match_fields=[] match_pids=[process.pid, process.ppid]], add_kubernetes_metadata, rename=[{From:kubernetes.pod.name To:metadata.container}]}: could not fetch value for key: metadata.container, Error: key not found

And in the published event, "kubernetes.pod" is completely emptied out, as if it's deleted.

Notice it says "could not fetch value for key: metadata.container", but why would it be looking for the target key at all?

I went over rename.go (https://github.com/ruflin/beats/blob/af065c8bfd7931a7f3ba946360946a891ac26cbd/libbeat/processors/actions/rename.go) and found at least one bug:

On line 86, it

should be:

return fmt.Errorf("could not fetch value for key: %s, Error: %s", from, err)

instead of:

return fmt.Errorf("could not fetch value for key: %s, Error: %s", to, err)

since we are searching for the source, not the target. But that's just a bug in the error message.

I still don't understand why fields.GetValue (line 80) would return an error if the key exists. It's as if the renaming is somehow causing a deletion.
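One hypothesis for how GetValue could miss an existing leaf: if the nested object were stored with a concrete map type, a generic dotted-key traversal that only descends through map[string]interface{} would find the object but not the leaf inside it. Here's a stdlib-only sketch of that failure mode (this is hypothetical illustration code, not beats' actual MapStr implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// getValue walks a dotted key through nested maps. It only knows how to
// descend into map[string]interface{} -- a hypothetical stand-in for the
// kind of traversal a generic event-field lookup performs.
func getValue(m map[string]interface{}, key string) (interface{}, error) {
	parts := strings.Split(key, ".")
	var cur interface{} = m
	for _, p := range parts {
		node, ok := cur.(map[string]interface{})
		if !ok {
			// A concretely typed map (e.g. map[string]string) fails the
			// type assertion, so the leaf looks like a missing key.
			return nil, fmt.Errorf("key not found")
		}
		cur, ok = node[p]
		if !ok {
			return nil, fmt.Errorf("key not found")
		}
	}
	return cur, nil
}

func main() {
	event := map[string]interface{}{
		"kubernetes": map[string]interface{}{
			// Suppose a processor stored the pod object with a concrete
			// map type instead of map[string]interface{}.
			"pod": map[string]string{"name": "filebeat-q68f2"},
		},
	}

	// Fetching the object works: no descent into the typed map is needed.
	v, err := getValue(event, "kubernetes.pod")
	fmt.Println(v, err) // map[name:filebeat-q68f2] <nil>

	// Fetching the leaf fails, even though the data is "there".
	_, err = getValue(event, "kubernetes.pod.name")
	fmt.Println(err) // key not found
}
```

That would line up with the observation later in this thread that renaming the whole object works while renaming the leaf inside it fails.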

Any ideas??? A workaround would be greatly appreciated!

Thanks in advance,

Tal

Could you share your full config file? Are you using auto discovery?

Note: I opened a PR for the rename log message here: https://github.com/elastic/beats/pull/7614 Thanks for the pointer.

Sure, this is the minimal config that reproduces the problem. I omitted the output part as I don't think it's relevant. All you need is an "env" label on your Kubernetes pod; you will then have kubernetes.labels.env as a field, and the rename doesn't work. I DID discover that renaming kubernetes.labels as a whole DOES work. I believe the bug has something to do with renaming a scalar, possibly a scalar of a specific type, or the last scalar in the object, or renaming two scalars on the same object, something like that. Anyway, the config:

filebeat.inputs:
  - type: log
    paths:
      - /var/lib/docker/containers/*/*.log

    close_removed: true
    clean_removed: true

    json.keys_under_root: true

    processors:

      - add_kubernetes_metadata:
          in_cluster: true

      - rename:
          fields:
            - from: "kubernetes.labels.env"
              to: "metadata.env"

          ignore_missing: true
          fail_on_error: false

I don't use autodiscover. I'm not sure what it does, actually, as everything is pretty much discovered correctly even without it.

If you need any more info, or for me to try something, do tell. For now, I'm renaming kubernetes.labels completely.

BTW, the issue reproduces with both 6.3.0 and 6.3.1.

I'm still investigating this, but one thing I spotted: the config you posted has fail_on_error: false, yet the log error message above would indicate it's set to true. Did you modify this value?

Assuming an error happened on the delete or put step while fail_on_error is set to false, that could explain the missing data. But the log above indicates it already fails when getting the value.

I'll keep you posted.

Any chance you could share an output example of the event that worked (with renaming the labels) and one of the events where it didn't work with the initial config?

Sure, here you go.

You are right, I miscopied the configuration in my last post. I removed the fail_on_error when investigating (to see on what it fails), and I accidentally pasted it in when posting the config.

Config which fails: renaming a leaf value. I changed the field to one that is present on all containers, as it's part of the basic Kubernetes metadata. You can see that the entire kubernetes.pod object got deleted, even though the log suggests the processor failed before reaching the deletion step.

filebeat.inputs:
  - type: log
    paths:
      - /var/lib/docker/containers/*/*.log

    close_removed: true
    clean_removed: true

    json.keys_under_root: true

    processors:

      - add_kubernetes_metadata:
          in_cluster: true

      - rename:
          fields:
            - from: "kubernetes.pod.name"
              to: "metadata.container"

Log of failure:

2018-07-19T17:39:06.709Z	DEBUG	[kubernetes]	add_kubernetes_metadata/matchers.go:57	Incoming source value: /var/lib/docker/containers/0b4d8d8db50a7c35878305ef2979a80c541a0af8afb5e777fcd494f5d34e1424/0b4d8d8db50a7c35878305ef2979a80c541a0af8afb5e777fcd494f5d34e1424-json.log
2018-07-19T17:39:06.709Z	DEBUG	[kubernetes]	add_kubernetes_metadata/matchers.go:80	Using container id: 0b4d8d8db50a7c35878305ef2979a80c541a0af8afb5e777fcd494f5d34e1424
2018-07-19T17:39:06.709Z	DEBUG	[rename]	actions/rename.go:64	Failed to rename fields, revert to old event: could not fetch value for key: metadata.container, Error: key not found
2018-07-19T17:39:06.709Z	DEBUG	[filter]	pipeline/processor.go:157	fail to apply processor client{add_kubernetes_metadata, rename=[{From:kubernetes.pod.name To:metadata.container}]}: could not fetch value for key: metadata.container, Error: key not found
2018-07-19T17:39:06.709Z	DEBUG	[publish]	pipeline/processor.go:291	Publish event: {
  "@timestamp": "2018-07-19T17:39:06.709Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.3.1"
  },
  "stream": "stderr",
  "input": {
    "type": "log"
  },
  "host": {
    "name": "filebeat-l97k5"
  },
  "offset": 212831,
  "log": "<REDACTED>",
  "time": "2018-07-19T17:39:00.542287868Z",
  "prospector": {
    "type": "log"
  },
  "beat": {
    "hostname": "filebeat-l97k5",
    "version": "6.3.1",
    "name": "filebeat-l97k5"
  },
  "kubernetes": {
    "container": {
      "name": "filebeat"
    },
    "pod": {},
    "node": {
      "name": "<REDACTED>"
    },
    "namespace": "default",
    "labels": {
      "k8s-app": "filebeat",
      "kubernetes": {
        "io/cluster-service": "true"
      },
      "pod-template-generation": "9",
      "controller-revision-hash": "650375567"
    }
  },
  "source": "/var/lib/docker/containers/0b4d8d8db50a7c35878305ef2979a80c541a0af8afb5e777fcd494f5d34e1424/0b4d8d8db50a7c35878305ef2979a80c541a0af8afb5e777fcd494f5d34e1424-json.log"
}

Working configuration (the only change is renaming an object instead of the value within):

filebeat.inputs:
  - type: log
    paths:
      - /var/lib/docker/containers/*/*.log

    close_removed: true
    clean_removed: true

    json.keys_under_root: true

    processors:

      - add_kubernetes_metadata:
          in_cluster: true

      - rename:
          fields:
            - from: "kubernetes.pod"   # <-- changed from "kubernetes.pod.name"
              to: "metadata.container"

And now it succeeds, notice that now metadata.container.name exists as the entire "kubernetes.pod" object got renamed successfully:

2018-07-19T17:42:37.988Z	DEBUG	[kubernetes]	add_kubernetes_metadata/matchers.go:57	Incoming source value: /var/lib/docker/containers/6d70a3e01e1568783f5730d7be17fe4274a28c05b1caf14303da00c4aaba8224/6d70a3e01e1568783f5730d7be17fe4274a28c05b1caf14303da00c4aaba8224-json.log
2018-07-19T17:42:37.988Z	DEBUG	[kubernetes]	add_kubernetes_metadata/matchers.go:80	Using container id: 6d70a3e01e1568783f5730d7be17fe4274a28c05b1caf14303da00c4aaba8224
2018-07-19T17:42:37.988Z	DEBUG	[publish]	pipeline/processor.go:291	Publish event: {
  "@timestamp": "2018-07-19T17:42:37.988Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.3.1"
  },
  "source": "/var/lib/docker/containers/6d70a3e01e1568783f5730d7be17fe4274a28c05b1caf14303da00c4aaba8224/6d70a3e01e1568783f5730d7be17fe4274a28c05b1caf14303da00c4aaba8224-json.log",
  "offset": 1867456,
  "time": "2018-07-19T13:21:00.650403009Z",
  "metadata": {
    "container": {
      "name": "live-snapshot-service-55d84884dc-qkfhb"
    }
  },
  "host": {
    "name": "filebeat-ttfzz"
  },
  "beat": {
    "name": "filebeat-ttfzz",
    "hostname": "filebeat-ttfzz",
    "version": "6.3.1"
  },
  "stream": "stdout",
  "log": "<REDACTED>",
  "prospector": {
    "type": "log"
  },
  "input": {
    "type": "log"
  },
  "kubernetes": {
    "labels": {
      "env": "INT",
      "pod-template-hash": "1184044087",
      "service": "live-snapshot-service"
    },
    "container": {
      "name": "live-snapshot-service"
    },
    "node": {
      "name": "<REDACTED>"
    },
    "namespace": "int"
  }
}


If I remove the rename completely, you can see that kubernetes.pod.name does exist and should be renameable:

2018-07-19T17:51:56.262Z	DEBUG	[kubernetes]	add_kubernetes_metadata/matchers.go:57	Incoming source value: /var/lib/docker/containers/69f22c5276ab1418d969d2b15c1bce7f77b3c04a980026493cdd200ef48ea533/69f22c5276ab1418d969d2b15c1bce7f77b3c04a980026493cdd200ef48ea533-json.log
2018-07-19T17:51:56.262Z	DEBUG	[kubernetes]	add_kubernetes_metadata/matchers.go:80	Using container id: 69f22c5276ab1418d969d2b15c1bce7f77b3c04a980026493cdd200ef48ea533
2018-07-19T17:51:56.263Z	DEBUG	[publish]	pipeline/processor.go:291	Publish event: {
  "@timestamp": "2018-07-19T17:51:56.262Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.3.1"
  },
  "kubernetes": {
    "container": {
      "name": "filebeat"
    },
    "pod": {
      "name": "filebeat-q68f2"
    },
    "node": {
      "name": "<REDACTED>"
    },
    "namespace": "default",
    "labels": {
      "pod-template-generation": "9",
      "controller-revision-hash": "650375567",
      "k8s-app": "filebeat",
      "kubernetes": {
        "io/cluster-service": "true"
      }
    }
  },
  "host": {
    "name": "filebeat-q68f2"
  },
  "prospector": {
    "type": "log"
  },
  "log": "<REDACTED>",
  "stream": "stderr",
  "time": "2018-07-19T17:51:50.688467895Z",
  "source": "/var/lib/docker/containers/69f22c5276ab1418d969d2b15c1bce7f77b3c04a980026493cdd200ef48ea533/69f22c5276ab1418d969d2b15c1bce7f77b3c04a980026493cdd200ef48ea533-json.log",
  "input": {
    "type": "log"
  },
  "beat": {
    "name": "filebeat-q68f2",
    "hostname": "filebeat-q68f2",
    "version": "6.3.1"
  },
  "offset": 1188687
}

Hope it helps.

Thanks a lot!!

Tal

One thing I started to wonder: does the field exist on all events, or are there a few events where it might be missing?

If you have the rename processor in place, does it go wrong on all events or just a subset?

All events. When I put it in, I get no events into Elasticsearch. Another proof the field exists on all events: I am now renaming the entire parent object, and when I search for events without the specific fields, I find none. Meaning: all events have the metadata and should be renamed.

Can you think of a way to print the intermediate value of the event after adding the Kubernetes metadata but before running rename, without compiling a specialized version of filebeat? If we could do that, it would show whether the field indeed exists in the specific event whose rename fails.

I'm pretty sure there's a problem in MapStr with getting that leaf value. Not sure how to test that theory.

Hi, sorry for the long silence on this one. I put together a test here to try to confirm this: https://github.com/elastic/beats/compare/master...ruflin:k8s-rename?expand=1 Unfortunately the test passes.

Could you check the code to see if that is really the use case you are looking for that doesn't work?

Hi, thanks for replying!

Yeah, the test looks alright. My guess is that add_kubernetes_metadata somehow writes the fields differently than the other processors do, in a way that makes them somehow unrenameable?

Tal

As far as I can see, k8s writes the fields the same way as all the other processors. My best guess at the moment is that it's a timing issue. The most confusing part for me is that you can rename the object but not the field inside it.

Now that I wrote this, let's try another idea. What happens if you add the `rename` processor at the top level? You can specify `processors` inside an input but also at the top level. It "should" not have an effect, but it could be worth a try.

It didn't have any effect.

Look at the image:

See the filter: events where the metadata.labels.service field exists.

The (almost) empty parts are when the rename processor looks like this:

- rename:
    fields:
      - from: "docker.container.id"
        to: "metadata.container"
      - from: "kubernetes.labels.service"
        to: "metadata.labels.service"

    ignore_missing: true
    fail_on_error: false

The full parts are when the processor looks like this:

- rename:
    fields:
      - from: "docker.container.id"
        to: "metadata.container"
      - from: "kubernetes.labels"
        to: "metadata.labels"

    ignore_missing: true
    fail_on_error: false

I actually don't quite understand why some events are "slipping through". I guess they are old events or something, but there is definitely something off with renaming the leaf value.

We need more debug information to diagnose this, especially the value of the event object before starting the rename processor.

Tal

P.S.

I opened a feature request in Elasticsearch to allow embedding an ingest pipeline as a step in another ingest pipeline. If/when that happens, I can have different ingest pipelines for different scenarios and do the renames in Elasticsearch instead of filebeat, without copy-pasting the other steps: https://github.com/elastic/elasticsearch/issues/31842 and then this issue can be worked around without hassle.

I'm still stuck on this one. On the one hand, it is strange that some events seem to make it through, so I ask myself what is different about those events. On the other hand, I suspect this could be a bug, but I can't yet see how it would happen.

Could you share your full config, including everything you have in there (minus passwords :wink:)? Just want to make sure we don't miss anything; especially wondering whether it could be related to the internal queue or something like that.

@ruflin

Here's the full file; nothing is omitted (although some values are read externally, but I'm pretty sure that won't be a problem). Do tell if you think of anything else I can try. I'm pretty sure the events that make it through are leftovers from the old configuration after I change it:

filebeat.inputs:
  - type: log
    paths:
      - /var/lib/docker/containers/*/*.log

    close_removed: true
    clean_removed: true

    json.keys_under_root: true

    processors:
      - add_docker_metadata: ~

      - add_kubernetes_metadata:
          in_cluster: true

      - add_cloud_metadata:
          cloud.id: ${ELASTIC_CLOUD_ID}
          cloud.auth: ${ELASTIC_CLOUD_AUTH}

      - rename:
          fields:
            - from: "docker.container.id"
              to: "metadata.container"
            - from: "kubernetes.labels.env"
              to: "metadata.env"

          ignore_missing: true
          fail_on_error: false

output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
  username: ${ELASTICSEARCH_USERNAME}
  password: ${ELASTICSEARCH_PASSWORD}
  index: "filebeat-k8s-%{+yyyy.MM.dd}"
  pipeline: "filebeat"

setup.template.name: "filebeat-k8s-%{[beat.version]}"
setup.template.pattern: "filebeat-k8s-%{[beat.version]}-*"
setup.template.overwrite: true

Unfortunately nothing special in the config above :frowning: One additional thing we could try:

  • Remove all processors except for the k8s and rename processor
  • Only rename the k8s field

But this too should not make a difference. I'm really stuck on this one :frowning:

As you guessed, no effect. Note that events are arriving, but they don't contain the renamed field, as it gets deleted when I try to rename a leaf instead of an object.

Regarding how to track down this slippery issue: how about adding super-verbose logging to the rename processor in debug mode in the next version (6.3.3? 6.4.0?), covering:

  1. The input event the processor got
  2. The exact path it made through the code, to understand how the target field gets deleted
  3. The output from the processor
  4. Any errors caught here

And when it gets released, I'll update, run again and report?
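To make the idea concrete, here's a hypothetical shape for such instrumentation: each step logs the event before, the fetched value, and the event after (a stdlib-only sketch on a flat map, not the real processor's code):

```go
package main

import (
	"fmt"
	"log"
)

// tracedRename wraps each step of a hypothetical rename in debug logs, so
// the input event, the fetched value, the deletion, and the output event
// are all visible -- roughly the four logging points proposed above.
func tracedRename(event map[string]interface{}, from, to string) error {
	log.Printf("rename: input event: %v", event)
	value, ok := event[from]
	if !ok {
		log.Printf("rename: GetValue(%q) failed: key not found", from)
		return fmt.Errorf("could not fetch value for key: %s", from)
	}
	log.Printf("rename: GetValue(%q) = %v", from, value)
	delete(event, from)
	log.Printf("rename: deleted %q", from)
	event[to] = value
	log.Printf("rename: put %q, output event: %v", to, event)
	return nil
}

func main() {
	event := map[string]interface{}{"kubernetes.labels.env": "INT"}
	if err := tracedRename(event, "kubernetes.labels.env", "metadata.env"); err != nil {
		log.Printf("rename: error: %v", err)
	}
}
```

With logs at those four points, a failing event would show exactly which step the value disappears at.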

As 6.4 is already in feature freeze this will not work. But we could work with custom binaries.

Are you able to compile the binary on your own and we could collaborate on where to add debug logs?

I don't have Go, and I've never tried to compile filebeat. But looking at the developer guide, it seems straightforward.

But I was hoping that perhaps you could provide a custom Docker container on Docker Hub, or at least a compiled custom binary to swap into the existing container, so I could run it and report back?

Would that be possible...?

If not, I'll try to install Go and compile the existing source, and if that works, we can talk about adding more logs.