How to disable add_host_metadata processor?

How can I disable the built-in add_host_metadata processor in filebeat >= 6.3.x?

My events already contain a host field holding a client IP address, and that field now gets overwritten by the host metadata. (I'm attempting to upgrade from 6.2 to 6.3, eventually targeting 6.5.)

My prospector configs look like this:

filebeat.prospectors:
- type: log
  fields:
    event_type: structlog
  fields_under_root: true
  json.keys_under_root: true
  json.add_error_key: true
  json.overwrite_keys: true
  paths:
    - /logs-pickup/*/instance?-json.log*

Even though I have json.overwrite_keys enabled in the prospectors, this doesn't seem to have an effect - the host field still contains an object with the host metadata:

logstash_1           |           "host" => {
logstash_1           |         "containerized" => true,
logstash_1           |          "architecture" => "x86_64",
logstash_1           |                  "name" => "e243941ad7e5",
logstash_1           |                    "os" => {
logstash_1           |              "version" => "7 (Core)",
logstash_1           |               "family" => "redhat",
logstash_1           |             "codename" => "Core",
logstash_1           |             "platform" => "centos"
logstash_1           |         }

This also breaks the GeoIP plugin (which I have pointed at the host field for its source) and unnecessarily inflates all my events with data I don't need.

Ok, it seems it might not necessarily be just the add_host_metadata processor that's writing to my host field.

After testing various combinations of json.overwrite_keys in the prospectors, processors.add_host_metadata, attempting to drop the host field in a processor, and add_hostname => false in the logstash beats input plugin, I'm seeing basically 4 different results (none of which is the desired one, my payload in the host field):

  • host field completely absent
  • "host": {"name":"dc79b87d3f81"}
  • "host": "dc79b87d3f81"
  • "host": {"name":"dc79b87d3f81", ..., [full metadata as shown above]}

This is so frustrating. I could live with a solution where I would rename the host field containing my intended payload string to something like client_ip, before filebeat overwrites it, and leave the unneeded host metadata in, but I can't figure out a way to do that either.

(FYI, I'm running filebeat via the official docker image. Could this possibly also be related to some docker autodiscover hint magic?)

Hi @lukasg

I'm not sure how to solve your problem (yet). In current versions of Filebeat, add_host_metadata is an option in the reference file (yml), and you just have to remove it if you don't want to use it.

I just don't remember how it was in 6.3, sorry, but I'd bet that you can disable it by removing the option from the file.
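For reference, here is a rough sketch of what that entry tends to look like in a filebeat.yml derived from the shipped defaults. This is an assumption based on newer default configs; the 6.3 file may differ, and a custom config may not contain a processors section at all.

```yaml
# filebeat.yml (sketch; the exact default contents vary by version)
processors:
  - add_host_metadata: ~   # delete or comment out this line to disable the processor
```

If the config has no such entry, the processor is not enabled through the file.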

Can you link me to the Docker image you are using? Maybe it's activated by default there as well as in the file.

A workmate also passed me this, in case it helps: https://www.elastic.co/guide/en/beats/libbeat/6.3/breaking-changes-6.3.html#breaking-changes-mapping-conflict

He also mentioned that the host.name will still be written even if you disable add_host_metadata and that it's not recommended to use a processor to remove it.

At that link you can find some workarounds for this issue. I hope it helps.

Hey @Mario_Castro

Thanks for your response. It turns out add_host_metadata wasn't the culprit after all. As you described, it's only enabled by default via the reference yml file, and since I'm using a config that doesn't include that processor, it's not what's overwriting the host field in my case. It must be something else.

He also mentioned that the host.name will still be written even if you disable add_host_metadata

That's exactly what I'm seeing ("host": {"name":"dc79b87d3f81"})

and that it's not recommended to use a processor to remove it.

Well, it doesn't seem that I have a lot of options here. I absolutely, positively need the payload data from my event's host field (it contains the remote client IP in structured JSON web server access logs).

And it doesn't seem that there's a way to rename my own host field to something else before filebeat overwrites it (in order to preserve its contents). I would be totally fine with renaming my field to something like client_ip, but I can't change the structure of the incoming JSON files that filebeat reads - and in those, the field is called host unfortunately.

I did read the breaking changes section you linked (should've mentioned that in my original post, sorry), but unfortunately it doesn't cover my case at all: how to preserve data from an actual payload field called host when it conflicts with the host fields from ECS / filebeat.

When your workmate says it's not recommended to use a processor to remove it, could you please ask him to elaborate on that a bit, if you get a chance? What kinds of issues could I be facing if I do that?

Because I managed to find a workaround that kinda does the trick for me.

The problem seems to be that json.overwrite_keys: true doesn't work to override the host field, but using the decode_json_fields processor (instead of the json reader in the prospector/input) and enabling overwrite_keys for it actually does.

Can you link me to the Docker image you are using?

I managed to reproduce the behavior without the Docker image (see below). I was using the filebeat 6.5.4 image (I "gave up" on 6.3 and tried to investigate and reproduce the issue directly with 6.5), but in the end I didn't really see a difference in behavior between Docker and running FB on a real CentOS machine.


So, to summarize with a minimal test case (on a CentOS 7.4 machine, taking docker and logstash out of the mix):

The logfile contains events (one JSON object per line) with the following shape:

{"host": "192.168.10.10", "user": "john.doe"}

This config does not work:

filebeat.inputs:
  - type: log
    paths:
       - /var/log/sample-*.log
    json.keys_under_root: true
    json.add_error_key: true
    json.overwrite_keys: true

output.file:
  path: "/var/log"
  filename: filebeat.out.log
  codec.json:
    pretty: true

It results in the following output (fb.local is the machine's hostname, "192.168.10.10" is the value that should have been preserved for the host field according to json.overwrite_keys):

{
  "@timestamp": "2018-12-28T10:11:34.719Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.5.4"
  },
  "source": "/var/log/sample-2018-12-28 11:11:32+01:00.log",
  "offset": 0,
  "json": {
    "host": "192.168.10.10",
    "user": "john.doe"
  },
  "prospector": {
    "type": "log"
  },
  "input": {
    "type": "log"
  },
  "host": {
    "name": "fb.local"
  },
  "beat": {
    "name": "fb.local",
    "hostname": "fb.local",
    "version": "6.5.4"
  }
}

The following config works however:

filebeat.inputs:
  - type: log
    paths:
       - /var/log/sample-*.log

processors:
  - decode_json_fields:
      fields: ["message"]
      process_array: true
      max_depth: 2
      target: ""
      overwrite_keys: true
  - drop_fields:
      fields: ["message"]

output.file:
  path: "/var/log"
  filename: filebeat.out.log
  codec.json:
    pretty: true

Resulting in this output, where the host field's content is preserved:

{
  "@timestamp": "2018-12-28T10:07:39.306Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.5.4"
  },
  "prospector": {
    "type": "log"
  },
  "input": {
    "type": "log"
  },
  "host": "192.168.10.10",
  "beat": {
    "name": "fb.local",
    "hostname": "fb.local",
    "version": "6.5.4"
  },
  "user": "john.doe",
  "source": "/var/log/sample-2018-12-28 11:07:32+01:00.log",
  "offset": 0
}
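For anyone who would rather move the payload value to a separate field such as client_ip (the name floated earlier in this thread) instead of leaving it in host, the working config could probably be extended with a rename step. This is an untested sketch: it assumes your Filebeat version ships the rename processor (I believe it arrived in 6.4, so 6.5.4 should have it, but please check the docs for your version).

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/sample-*.log

processors:
  # Decode the JSON line into the event root, letting payload keys win
  # (same approach as the working config above).
  - decode_json_fields:
      fields: ["message"]
      target: ""
      overwrite_keys: true
  # Assumption: the rename processor is available in this Filebeat version.
  # Moves whatever is stored in host to client_ip.
  - rename:
      fields:
        - from: "host"
          to: "client_ip"
      ignore_missing: true
      fail_on_error: false
  - drop_fields:
      fields: ["message"]
```

One caveat: rename moves whatever happens to be in host at that point, so a log line without a host key would presumably end up with Filebeat's own host object under client_ip. After the rename the event has no host field at all, which may or may not suit downstream consumers.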
