Logstash errors after upgrading to filebeat-6.3.0

An additional workaround is to use versioned indices, as is recommended in the docs: https://www.elastic.co/guide/en/beats/filebeat/current/logstash-output.html#_accessing_metadata_fields

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}" 
  }
}

This makes sure there are no conflicts between templates / indices of different versions of Beats. It is worth using even if you are not affected by the above issue.

I also want to share a bit of background on this change. On the Beats side we introduced the add_host_metadata processor, which follows the schema from ECS: https://github.com/elastic/ecs While running some tests we found that when data is first ingested through LS without the processor and the processor is then enabled, we see the above error. The reason is that the Beats input in LS adds a host field when no host field exists. So our solution here is to send host.name with each event, which prevents LS from adding the host field. This works as long as a new index is used for each Beat version (see example above).

We will work on improving the docs and try to come up with other ways to migrate, like introducing a config option in the beats input.

OK, we make versioned indices to prevent the mapping error. The stuff is indexed, and then? The reports are corrupted, because the host field is a text field in some indices and an object field in others.


Also, putting the beat version into the index name breaks index size / number of shards / ... optimizations in large deployments. Imagine a bigger organization with hundreds of systems running multiple versions of beats for several reasons: update policy in the organization, team separation, ... So instead of 3000 indices one would end up with 15000 or more, which brings optimization issues. Elastic should be very careful with changes like this, because in an enterprise environment software is expected to stay backward compatible for a long time (not to break everything with each minor version).


worked for me. Thanks! :slight_smile:


Here is a PR that is planned to be added to the docs and shares some more details about the issue and how to fix it: https://github.com/elastic/beats/pull/7398


I don't see how simply making users more aware of the issue helps those who are affected by it. This change has massive implications for existing stacks.

Removing/renaming the host object as a "workaround" doesn't sit well with me, as it permanently removes the ability to make use of the host metadata feature. After implementing this workaround, there is no clear way forward if you want to use this feature.

Why couldn't the new object be called host_metadata if that's what its intended purpose is? Why would you clobber a critical field that's been in use by many users for such a long time? I really can't get my head around the logic behind this at all.

In my case we have several hundred hosts all currently using the host field, and I have no idea at all how to proceed. For all intents and purposes we are stuck on 6.2.4, as the alternative is to spend days changing and testing everything in our pipeline and somehow coordinate beats upgrades across all these hosts so we don't run into mapping issues and lose production data. On top of that, even if we do manage to upgrade without data loss, we will have mapping conflicts with our historical data, on a critical identification field, making it unusable.

As far as breaking changes go, this one is atrocious.


Looking at https://github.com/elastic/ecs#host, could I perhaps suggest using node or instance instead of host? Surely there are plenty of other suitable names for this object?

Hi @ceekay

If you look at https://www.elastic.co/guide/en/beats/libbeat/current/breaking-changes-6.3.html which use case does your setup fall under? I hope I'm able to provide solutions that have as little impact as possible.

@ruflin our use case is closest to "You use a custom template and your indices are not versioned".

The main issue here is that we already use a host field, and have done for several years. We are shipping data from several hundred hosts, spanning many different clients, and with the mix of hosts and different naming conventions that our clients have, we invariably have a number of short hostname collisions across different systems, e.g., prod-web1, etc.

These hostname collisions necessitated the use of a custom field containing FQDNs for all hosts; this being the host field, and we've been doing it this way since well before Beats existed (i.e., Logstash Forwarder, Lumberjack and Log Courier).

My initial tests with Filebeat 6.3.0 seem to indicate that we can't even ship a custom host field any more - it's simply clobbered by the host object in this version, thus we can't identify systems by their FQDN host field when using Beats 6.3.0. This is a major problem for us.

Adding to the problem is we have clients who are required to retain indices for long periods for PCI compliance reasons. If we attempted to change this mapping, we couldn't simply wait for their affected indices to roll over because of the retention periods these customers are required to have. Changing critical mappings such as the host field will also have an ongoing effect for any of their historical data, as there will be mapping conflicts lasting for months or years. Not being able to aggregate data based on host will be entirely unacceptable to these clients.

Our Logstash pipeline refers to the host field in a number of places, as do the majority of our Kibana dashboards. Many of our custom systems which query our log data from outside the Elastic Stack (monitoring, metrics, alerting, etc) also refer to the host field. Changing this mapping means we would need to modify almost everything that touches our Elastic Stack. Coordinating such a change would be a nightmare.

I am sure we are not alone in using a field named host to store hostname data. This seemingly simple change breaks everything we have been building over the past few years.

I realise there are ways we could address many of the points I've raised here, e.g., reindexing data, renaming fields in the Logstash pipeline, etc, however when you consider everything that would be required to pull this off, it's a massive amount of very risky work, just to get around a mapping change.

I do have a suggestion: If Beats could be optionally configured with a custom root object for host metadata, this would make our problems disappear entirely. As per my previous comment, if we could use something like node or instance this would suit us just fine, as it's descriptive enough, and we obviously don't have any pre-existing mappings for host metadata to worry about.


Thanks for sharing so many details about your use case, really appreciate it.

To your last point: There is a way you could already do this today. 6.3 shipped with a rename processor. So you could rename the host to node for example: https://www.elastic.co/guide/en/beats/metricbeat/current/rename-fields.html
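As a minimal sketch of that idea, the rename processor could be configured in filebeat.yml roughly like this. The target name node is just the example from this thread, and the result should be checked against your own output before relying on it:

```yaml
# filebeat.yml (fragment) - sketch only
processors:
  # Move the host object that 6.3 adds (host.name etc.) to "node".
  - rename:
      fields:
        - from: "host"
          to: "node"
      # Do not raise an error for events that carry no host field.
      ignore_missing: true
```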

Based on your description above I assume you have data from sources which are not beats in the same index, meaning these still have the host field and will have in the future. Is there an option to have future indices split by data type/source?

As for the host field that came from Beats through LS in the past: it is actually a copy of beat.hostname, and beat.hostname is still there. This means that if you drop host.name on the Filebeat side, you will have the exact same behaviour as you had in 6.2: the host field will still be populated. As you use a custom template, meaning you don't have host.name defined, things should keep working.
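A minimal sketch of dropping host.name on the Filebeat side, assuming the drop_fields processor in filebeat.yml. Whether dropping host.name alone is enough, or whether the whole host object needs to go, is worth verifying against your own pipeline:

```yaml
# filebeat.yml (fragment) - sketch only, test before rolling out
processors:
  # Remove the host.name field that Filebeat 6.3 sends with each event,
  # so that the Beats input in Logstash can populate host as it did in 6.2.
  - drop_fields:
      fields: ["host.name"]
```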

Thanks - I'll need to test the processor on Monday though. It's well outside work hours now in my timezone.

As for the host field, we set it directly in the Beat's config (resolved from Puppet or Ansible facts) - it's not added at the LS Beats input. This is the only reliable way we've found to get a system's FQDN, as Go doesn't appear to be able to resolve them. The beat.hostname field is a short hostname, so that would cause the name collisions mentioned above.

Given that we're adding host as part of the Beat's config, will the processor rename the new host object before our custom field is added? If not I'm not sure how this solution would work.

Here's an example config to explain:

filebeat:
  registry_file: /var/lib/filebeat/registry
  config.prospectors:
    enabled: true
    path: /etc/filebeat/conf.d/*.yml
fields:
  host: sample.host.fqdn
  timezone: Pacific/Auckland
fields_under_root: true
output:
  logstash:
    hosts: ["ingress1.xxx.xxx:10514", "ingress2.xxx.xxx:10514"]
    compression_level: 3
    ssl:
      certificate_authorities: ["/etc/ssl/certs/ingress.crt"]
      verification_mode: full
[...]

I see, you set host directly on the Beats level. I need to check in detail in what order the processors are applied.

I also need to test what the effect of your above config is. It should probably work, as from a Beats perspective it's fine to have host and host.name. To remove only one field you could drop just host.name instead of host. Let me know if that works.

Side note: My hope is that we can actually solve the FQDN problem with the add_host_metadata processor and have a config option there for making host.name FQDN.

My initial testing showed that our custom field wasn't making it to the output - there was no FQDN at all in the Filebeat 6.3.0 messages, but the host.name field was present.

Reverting back to 6.2.4 with the same config returned it to what we'd normally expect to see.

If we can possibly keep the host metadata (but renamed!) that would be great as it looks pretty useful.

[edit]
OK I just tried the rename processor (renaming host -> node) on a dev box and here's what I got. The custom host field was suppressed, unfortunately.

6.3.0 output:
{
  "@timestamp": "2018-06-29T13:57:54.914Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.3.0"
  },
  "beat": {
    "name": "dev-mgmt1",
    "hostname": "dev-mgmt1",
    "version": "6.3.0"
  },
  "offset": 49993950,
  "node": {
    "name": "dev-mgmt1"
  },
  "prospector": {
    "type": "log"
  },
  "platform": "Development",
  "type": "latency-test",
  "input": {
    "type": "log"
  },
  "timezone": "Pacific/Auckland",
  "source": "/var/log/latency-test.log",
  "message": "1530280673.966",
  "client_id": "dev"
}
6.2.4 output:
{
  "@timestamp": "2018-06-29T14:03:31.858Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.2.4"
  },
  "message": "1530281011.585",
  "source": "/var/log/latency-test.log",
  "offset": 49999020,
  "platform": "Development",
  "prospector": {
    "type": "log"
  },
  "timezone": "Pacific/Auckland",
  "type": "latency-test",
  "client_id": "dev",
  "host": "dev-mgmt1.fqdn.xxx",
  "beat": {
    "name": "dev-mgmt1",
    "hostname": "dev-mgmt1",
    "version": "6.2.4"
  }
}

Thanks for testing. I need to test this on my end as well, but another thing you could try is adding the host field under another name and then using the rename processor on it after you have renamed the host.* fields. I'm not 100% sure it will work, though.

The general problem is that fields: * existed before processors existed, which makes it a bit special, meaning you can't define the order in which it is executed. That is why I want to replace it with a processor: https://github.com/elastic/beats/issues/7350 If the above does not work, it sounds like we should increase the priority of that issue.

I'll try the host rename a bit later - it's after 2am here and I probably shouldn't be working.

Just a note on issue #7350: If adding fields is replaced by a processor, will there be allowances to continue adding them in external prospectors? Our prospectors are dropped in as fragments, and also contain root fields, such as client_id and type.

The rename appears to work fine. I've got the results I need with this config, adding a field named fqdn instead of host:

[...]
fields:
  fqdn: dev-mgmt1.fqdn.xxx
processors:
- rename:
    fields:
     - from: "host"
       to: "node"
     - from: "fqdn"
       to: "host"
    ignore_missing: true
[...]

This gives the following output:

[...]
  "host": "dev-mgmt1.fqdn.xxx",
[...]
  "node": {
    "name": "dev-mgmt1"
  },
[...]

My only reservation here is, would this be considered a hack? Is it going to cause us problems down the line?

Thanks for all your help so far @ruflin

For your first question about the processor: Processors also work on the prospector level, so this feature would keep working. A replacement for the fields config option would have to have the same or more functionality. The interesting part is that as soon as we have a processor, we can add more config options like overwrite_existing etc.

I would not consider the above a hack so you should be fine also in the long run. But as you know, these things are tricky :wink:

One note for the host.* in general: If you have the option and see a path forward where you can introduce it in your ingestion tooling in the long run, I would recommend it as we will start to use host.* more and more in 7.0.

Let me know if I can help further.

For clarification, this would go in a Logstash conf file?

So the following example conf file...

input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "metricbeat"
  }
}

Would be this?

input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "metricbeat"
  }
}
mutate {
  remove_field => [ "[host][name]" ]
  remove_field => [ "[host][id]" ]
  remove_field => [ "[host][architecture]" ]
  remove_field => [ "[host][os][platform]" ]
  remove_field => [ "[host][os][version]" ]
  remove_field => [ "[host][os][family]" ]
  remove_field => [ "[host][ip]" ]
  remove_field => [ "[host][mac]" ]
  remove_field => [ "[host][os]" ]
  remove_field => [ "[host]" ]
}
 mutate {
  add_field => {
    "host" => "%{[beat][hostname]}"
  }
}

Hi @mksavic

input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "metricbeat"
  }
}

filter {
  mutate {
    remove_field => [ "[host]" ]
  }
  mutate {
    add_field => {
      "host" => "%{[beat][hostname]}"
    }
  }
}