Logstash errors after upgrading to filebeat-6.3.0


(Greg Volk) #1

After upgrading from filebeat-6.2.4 to filebeat-6.3.0, none of my log messages make it into Logstash. I did not change filebeat.yml or logstash.conf during the upgrade. The logstash.stdout is full of errors like this:

[2018-06-14T16:36:44,073][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"logstash-2018.06.14", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x7915b5b2>], :response=>{"index"=>{"_index"=>"logstash-2018.06.14", "_type"=>"doc", "_id"=>"uPUo_2MB8dpigLNDCRGl", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [host]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:353"}}}}}

I reverted to filebeat-6.2.4, and everything works again: the logstash.stdout errors went away. I noticed that the fifth bullet point in the 6.3.0 release notes mentions the addition of a host.name field that could break Logstash configs. That bullet point also links to GitHub PR 7051, where it says "...and it could causes issues with any LS config that expects [host] to be a string..."

I just want to confirm that this breaking change is indeed what I'm running into, and I'm wondering if there is a workaround I can use in filebeat.yml or logstash.conf to mitigate it.

My filebeat.yml and logstash.conf files are below just in case anyone wants to look at them.

Thank you for your time.

#filebeat.yml
filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - /var/log/snort-eth1/alert.csv
    - /var/log/snort-eth2/alert.csv
- type: log
  enabled: true
  paths:
    - /var/log/auth.log
  include_lines: ['Failed password for']
  exclude_lines: ['invalid user']
  tags: ["failedpw"]
- type: log
  enabled: true
  paths:
    - /var/log/fail2ban.log
- type: log
  enabled: true
  paths:
    - /var/log/apache2/*.log
  exclude_lines: ['GET /server-status','GET /st.html']

output.logstash:
  hosts: ["localhost:5044"]

setup.kibana:
  host: "127.0.0.1:5601"

logging.level: info
logging.to_files: true
logging.metrics.enabled: true
logging.metrics.period: 60s
logging.files:
  path: /var/log/filebeat
  name: filebeat

#logstash.conf
input {
  beats {
    port => 5044
  }
}
filter {
  if [source] =~ "access" {
    mutate { replace => { type => "apache_access" } }
    # append response_time_us onto predefined COMBINEDAPACHELOG
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG} %{NUMBER:response_time_us}" }
    }

    date {
      match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
    mutate {
      # rename 'response' field to 'status'
      rename => {"response" => "status"}
      # make a copy of IP for name resolution
      copy => {"clientip" => "client_dns_name"}
    }
    dns {
      # resolve the IP
      reverse => ["client_dns_name"]
      action => "replace"
    }
  }
}

filter {
  if [source] =~ "error" {
    mutate {
      replace => { type => "apache_error" }
    }
  }
}

filter {
  if [source] =~ "fail2ban" {
    mutate {
      replace => { type => "fail2ban" }
    }
  }
}

filter {
  if [source] =~ "snort-eth1" {
    csv {
      separator => ","
      columns => ["timestamp","sig_generator","sig_id","sig_rev","msg","proto","src","srcport","dst","dstport","ethsrc","ethdst","ethlen","tcpflags","tcpseq","tcpack","tcplen","tcpwindow","ttl","tos","id","dgmlen","iplen","icmptype","icmpcode","icmpid","icmpseq"]
    }
    mutate {
      # change type
      replace => { type => "snort-eth1" }
      # make a copy of IP addrs for name resolution
      copy => {"src" => "src_dns_name"}
      copy => {"dst" => "dst_dns_name"}
    }
    dns {
      # resolve the addresses
      reverse => ["src_dns_name","dst_dns_name"]
      action => "replace"
    }
  }
}

filter {
  if [source] =~ "snort-eth2" {
    csv {
      separator => ","
      columns => ["timestamp","sig_generator","sig_id","sig_rev","msg","proto","src","srcport","dst","dstport","ethsrc","ethdst","ethlen","tcpflags","tcpseq","tcpack","tcplen","tcpwindow","ttl","tos","id","dgmlen","iplen","icmptype","icmpcode","icmpid","icmpseq"]
    }
    mutate {
      # change type
      replace => { type => "snort-eth2" }
      # make a copy of IP addrs for name resolution
      copy => {"src" => "src_dns_name"}
      copy => {"dst" => "dst_dns_name"}
    }
    dns {
      # resolve the addresses
      reverse => ["src_dns_name","dst_dns_name"]
      action => "replace"
    }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }

  # send debug data to stdout
  #stdout { codec => rubydebug }
}


#2

That is exactly how that breaking change manifests itself. You can confirm it in the Elasticsearch logs, where you will see the bulk request failing, together with the document that was rejected.
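
To make the failure mode concrete, here is a minimal Python sketch of the type clash. It is a toy model, not Elasticsearch code: `index_event` and the simplified `mapping` dict are hypothetical, and it assumes the existing logstash-* index already maps host as a string (text) field, as the "Can't get text on a START_OBJECT" error suggests.

```python
# Toy model of the mapping check (hypothetical; not Elasticsearch's real code).
def index_event(mapping, event):
    """Return "indexed", or raise ValueError when an object hits a text field."""
    for field, value in event.items():
        if mapping.get(field) == "text" and isinstance(value, dict):
            raise ValueError(f"failed to parse [{field}]: object sent to a text field")
    return "indexed"


# Assumption: the index created under 6.2.4 mapped `host` as text.
mapping = {"message": "text", "host": "text"}

event_6_2 = {"message": "login failed", "host": "web1"}            # host is a string
event_6_3 = {"message": "login failed", "host": {"name": "web1"}}  # host is an object

print(index_event(mapping, event_6_2))  # the string form is accepted
try:
    index_event(mapping, event_6_3)
except ValueError as err:
    print("rejected:", err)
```

The second event is the shape Filebeat 6.3.0 ships, which is why the same pipeline that worked on 6.2.4 starts returning status 400.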


(Jonathan Ocab) #3

I was affected by this same issue. I'm trying to understand what exactly I need to do to resolve this going forward.


#4

This is a duplicate discussion. The cause here is the same specification change in Filebeat, so let's continue in this thread.


#5

It is a big breaking change to the specification of the widely used "host" field, so I think it warranted a prominent notification.


#6

To avoid it on the LS side:

    # Remove host metadata
    mutate {
      remove_field => [ "[host][name]" ]
      remove_field => [ "[host][id]" ]
      remove_field => [ "[host][architecture]" ]
      remove_field => [ "[host][os][platform]" ]
      remove_field => [ "[host][os][version]" ]
      remove_field => [ "[host][os][family]" ]
      remove_field => [ "[host][ip]" ]
      remove_field => [ "[host][mac]" ]
      remove_field => [ "[host][os]" ]
      remove_field => [ "[host]" ]
    }
    mutate {
      add_field => {
        "host" => "%{[beat][hostname]}"
      }
    }

(Pier-Hugues Pellerin) #7

The following should be enough to fix the situation.

mutate {
  remove_field => [ "[host]" ]
}
mutate {
  add_field => {
    "host" => "%{[beat][hostname]}"
  }
}
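
For readers who want to see the effect on a single event, the following Python sketch mimics what the two mutate filters do. It is only a model under stated assumptions: `sprintf` and `apply_workaround` are hypothetical helpers, not Logstash APIs, and the event is represented as a plain dict.

```python
import re


def sprintf(template, event):
    """Resolve %{[a][b]} references; unresolved ones stay as literal text."""
    def lookup(match):
        value = event
        for key in re.findall(r"\[([^\]]+)\]", match.group(1)):
            if not isinstance(value, dict) or key not in value:
                return match.group(0)  # leave the reference untouched
            value = value[key]
        return str(value)
    return re.sub(r"%\{((?:\[[^\]]+\])+)\}", lookup, template)


def apply_workaround(event):
    """Mimic remove_field => ["[host]"] followed by add_field => {"host" => ...}."""
    result = {k: v for k, v in event.items() if k != "host"}
    result["host"] = sprintf("%{[beat][hostname]}", result)
    return result


event = {"beat": {"hostname": "web1"}, "host": {"name": "web1", "os": {"family": "debian"}}}
print(apply_workaround(event)["host"])  # web1
```

One thing worth noting: if an event reaches this filter without beat.hostname (for example, data from a non-Beats input), the reference is not resolved and host ends up as the literal text `%{[beat][hostname]}`, so it may be safer to wrap the filter in a conditional.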

#8

This is just a temporary workaround. I hope Elastic will remove the host-as-object feature so that we can keep using this workaround. If not, future dashboards and similar assets will use the host object fields, and users will somehow have to migrate. By the way, this workaround is not sufficient if you have already loaded the new Beats Elasticsearch templates.


(Eric Richter) #9

Why was this terrible feature built? Everyone who runs into this mapping error will delete (or replace) the host field, so the host field as an object will never be used. GitHub PR 7051 suggests removing the beat.name field. I would suggest removing the host field as an object instead.


(ruflin) #10

An additional workaround is to use versioned indices, as recommended in the docs: https://www.elastic.co/guide/en/beats/filebeat/current/logstash-output.html#_accessing_metadata_fields

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}" 
  }
}

This makes sure there are no conflicts between templates and indices of different Beats versions. It is worth using even if you are not affected by the issue above.

I also want to share a bit of background on this change. On the Beats side we introduced the add_host_metadata processor, which follows the schema from ECS: https://github.com/elastic/ecs While running some tests, we found that when data is first ingested through LS without the processor and the processor is enabled afterwards, we see the above error. The reason is that the Beats input in LS adds a host field when none exists. So our solution is to send host.name with each event, which prevents LS from adding the host field. This works as long as a new index is used for each Beat version (see example above).
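
The input behaviour described above can be sketched in a few lines of Python. This is a simplified model, not the plugin's actual code: `beats_input` is a hypothetical function, events are plain dicts, and the value copied into host is assumed to be beat.hostname.

```python
def beats_input(event):
    """Model of the described behaviour: add `host` only when it is missing."""
    event = dict(event)  # work on a copy
    if "host" not in event:
        event["host"] = event["beat"]["hostname"]
    return event


# 6.2-style event: no host field, so the input fills in a string.
from_6_2 = beats_input({"beat": {"hostname": "web1"}})

# 6.3-style event: host.name is already present, so the input leaves it alone.
from_6_3 = beats_input({"beat": {"hostname": "web1"}, "host": {"name": "web1"}})

print(from_6_2["host"])  # web1
print(from_6_3["host"])  # {'name': 'web1'}
```

The two results have different types for the same field, which is harmless when each Beat version writes to its own index and a problem when both land in one.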

We will work on improving the docs and try to come up with other ways to migrate, such as introducing a config option in the Beats input.


(Eric Richter) #11

OK, so we create versioned indices to prevent the mapping error. The data is indexed, and then what? The reports are broken, because the host field is a text field in some indices and an object field in others.
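
A rough Python sketch of that situation, assuming simplified mappings and hypothetical index names that follow the versioned pattern suggested earlier in the thread; `conflicting_fields` is an illustrative helper, not an Elasticsearch or Kibana API.

```python
def conflicting_fields(index_mappings):
    """Return the fields that are mapped with different types across indices."""
    seen = {}
    for mapping in index_mappings.values():
        for field, field_type in mapping.items():
            seen.setdefault(field, set()).add(field_type)
    return sorted(field for field, kinds in seen.items() if len(kinds) > 1)


# Hypothetical indices written by two Beats versions.
mappings = {
    "filebeat-6.2.4-2018.06.13": {"host": "text", "message": "text"},
    "filebeat-6.3.0-2018.06.14": {"host": "object", "message": "text"},
}

print(conflicting_fields(mappings))  # ['host']
```

Anything that queries or aggregates on host across both indices has to cope with that conflict.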


#12

Also, putting the Beats version into the index name breaks index size and shard count optimizations in large deployments. Imagine a bigger organization with hundreds of systems running multiple versions of Beats for several reasons: update policy in the organization, team separation, and so on. Instead of 3000 indices, one would end up with 15000 or more, which brings optimization issues. Elastic should be very careful with changes like this, because enterprise environments expect backward compatibility for a long time (not something that breaks with each minor version).
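
As a back-of-the-envelope check of those numbers, here is a small Python sketch. It assumes the worst case where every existing index is split once per Beats version in use, and five concurrent versions is an assumed figure chosen to match the 15000 above.

```python
def versioned_index_count(unversioned_indices, beat_versions_in_use):
    """Worst case: each existing index is split once per Beats version."""
    return unversioned_indices * beat_versions_in_use


print(versioned_index_count(3000, 5))  # 15000
```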


(Coding Mush) #13

Worked for me. Thanks! :)


#14

(ruflin) #15

Here is a PR that is planned to be added to the docs; it shares some more details about the issue and how to fix it: https://github.com/elastic/beats/pull/7398


#16

I don't see how simply making users more aware of the issue helps those who are affected by it. This change has massive implications for existing stacks.

Removing/renaming the host object as a "workaround" doesn't sit well with me, as it permanently removes the ability to make use of the host metadata feature. After implementing this workaround, there is no clear way forward if you want to use this feature.

Why couldn't the new object be called host_metadata if that's what its intended purpose is? Why would you clobber a critical field that's been in use by many users for such a long time? I really can't get my head around the logic behind this at all.

In my case we have several hundred hosts all currently using the host field, and I have no idea at all how to proceed. For all intents and purposes we are stuck on 6.2.4, as the alternative is to spend days changing and testing everything in our pipeline and somehow coordinate beats upgrades across all these hosts so we don't run into mapping issues and lose production data. On top of that, even if we do manage to upgrade without data loss, we will have mapping conflicts with our historical data, on a critical identification field, making it unusable.

As far as breaking changes go, this one is atrocious.


#17

Looking at https://github.com/elastic/ecs#host, could I perhaps suggest using node or instance instead of host? Surely there are plenty of other suitable names for this object?


(ruflin) #18

Hi @ceekay

If you look at https://www.elastic.co/guide/en/beats/libbeat/current/breaking-changes-6.3.html, which use case does your setup fall into? I hope I can provide solutions that have as little impact as possible.


#19

@ruflin our use case is closest to "You use a custom template and your indices are not versioned".

The main issue here is that we already use a host field, and have done for several years. We are shipping data from several hundred hosts, spanning many different clients, and with the mix of hosts and different naming conventions that our clients have, we invariably have a number of short hostname collisions across different systems, e.g., prod-web1, etc.

These hostname collisions necessitated the use of a custom field containing FQDNs for all hosts; this being the host field, and we've been doing it this way since well before Beats existed (i.e., Logstash Forwarder, Lumberjack and Log Courier).

My initial tests with Filebeat 6.3.0 seem to indicate that we can't even ship a custom host field any more - it's simply clobbered by the host object in this version, thus we can't identify systems by their FQDN host field when using Beats 6.3.0. This is a major problem for us.

Adding to the problem is we have clients who are required to retain indices for long periods for PCI compliance reasons. If we attempted to change this mapping, we couldn't simply wait for their affected indices to roll over because of the retention periods these customers are required to have. Changing critical mappings such as the host field will also have an ongoing effect for any of their historical data, as there will be mapping conflicts lasting for months or years. Not being able to aggregate data based on host will be entirely unacceptable to these clients.

Our Logstash pipeline refers to the host field in a number of places, as do the majority of our Kibana dashboards. Many of our custom systems which query our log data from outside the Elastic Stack (monitoring, metrics, alerting, etc) also refer to the host field. Changing this mapping means we would need to modify almost everything that touches our Elastic Stack. Coordinating such a change would be a nightmare.

I am sure we are not alone in using a field named host to store hostname data. This seemingly simple change breaks everything we have been building over the past few years.

I realise there are ways we could address many of the points I've raised here, e.g., reindexing data, renaming fields in the Logstash pipeline, etc, however when you consider everything that would be required to pull this off, it's a massive amount of very risky work, just to get around a mapping change.

I do have a suggestion: if Beats could optionally be configured with a custom root object for host metadata, this would make our problems disappear entirely. As per my previous comment, something like node or instance would suit us just fine, as it's descriptive enough, and we obviously don't have any pre-existing mappings for host metadata to worry about.


(ruflin) #20

Thanks for sharing so many details about your use case, I really appreciate it.

To your last point: There is a way you could already do this today. 6.3 shipped with a rename processor. So you could rename the host to node for example: https://www.elastic.co/guide/en/beats/metricbeat/current/rename-fields.html

Based on your description above I assume you have data from sources which are not beats in the same index, meaning these still have the host field and will have in the future. Is there an option to have future indices split by data type/source?

As for the host field that in the past came from Beats through LS: it is actually a copy of beat.hostname, and beat.hostname is still there. This means that if you drop host.name on the Filebeat side, you will have exactly the same behaviour as you had in 6.2, and the host field will still be populated. Since you use a custom template, meaning you don't have host.name defined, things should keep working.
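
The two Beats-side options mentioned in this post (renaming host, or dropping it) can be pictured with a short Python sketch. It is only an illustration on plain dicts: `rename_field` and `drop_field` are hypothetical stand-ins for the rename and drop_fields processors, and skipping a missing source field is a simplification made here, not a statement about the processors' defaults.

```python
def rename_field(event, source, target):
    """Move event[source] to event[target]; skip silently if source is absent."""
    event = dict(event)  # work on a copy
    if source in event:
        event[target] = event.pop(source)
    return event


def drop_field(event, name):
    """Return a copy of the event without the given top-level field."""
    return {k: v for k, v in event.items() if k != name}


shipped = {"beat": {"hostname": "web1"}, "host": {"name": "web1"}}

renamed = rename_field(shipped, "host", "node")  # metadata kept under `node`
dropped = drop_field(shipped, "host")            # metadata discarded

print(renamed)  # {'beat': {'hostname': 'web1'}, 'node': {'name': 'web1'}}
print(dropped)  # {'beat': {'hostname': 'web1'}}
```

In both cases the event leaves Beats without a host field, so, going by the explanation above, the LS Beats input would populate host from beat.hostname as it did in 6.2.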