Logstash pipeline continue on error?

We're running a Logstash 6.0.0 instance w/ an http input plugin. The plugin expects to receive an array of objects. Some bad data made it to the instance last night: an array w/ an integer in it instead of an object. That caused this error:

  [2017-11-22T05:08:51,063][FATAL][logstash.runner] An unexpected error occurred!
  {:error=>#<NoMethodError: undefined method `empty?' for 1511326405115:Fixnum>, :backtrace=>[
    "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-split-3.1.4/lib/logstash/filters/split.rb:89:in `block in filter'",
    "org/jruby/RubyArray.java:1734:in `each'",
    "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-split-3.1.4/lib/logstash/filters/split.rb:88:in `filter'",
    "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:145:in `do_filter'",
    "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:164:in `block in multi_filter'",
    "org/jruby/RubyArray.java:1734:in `each'",
    "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:161:in `multi_filter'",
    "/usr/share/logstash/logstash-core/lib/logstash/filter_delegator.rb:48:in `multi_filter'",
    "(eval):428:in `block in filter_func'",
    "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:501:in `filter_batch'",
    "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:477:in `worker_loop'",
    "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:439:in `block in start_workers'"]}
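The root cause is visible in the first backtrace frame: the split filter calls `empty?` on each element of the array, and Ruby integers don't respond to that method. A minimal Ruby sketch of the failure mode (not the actual split filter code):

```ruby
# Strings, arrays, and hashes respond to `empty?`, but Integers do not,
# so a bare integer element raises NoMethodError -- the same error class
# seen in the FATAL log above.
def check_element(value)
  value.empty?  # crashes when value is an Integer
rescue NoMethodError => e
  "rejected: #{e.class}"
end

puts check_element([])             # a real array/string element works fine
puts check_element(1511326405115)  # the bad integer element raises
```

Since the exception isn't rescued inside the filter, it bubbles all the way up the worker thread and takes the whole pipeline down.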

The pipeline uses a persistent queue. Logstash quit when it encountered this error and restarted. It then tried to process the same event again, which generated the same error, and it quit again. Infinite loop time. The CPU spiked and the VM crashed.

I read some discussion about a pipeline.continue_on_error setting that was in version 5.x, but I don't see it in 6.0? The dead letter queue seems nice, but it looks like it only works w/ elasticsearch outputs. I'm probably just missing something. Anyway, all this was to ask: how can I configure Logstash to not quit on error? Ideally I'd like it to log the error and then move on to the next event.
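For reference, the DLQ in 6.0 is switched on in logstash.yml; as noted above, only the elasticsearch output writes to it at this point, so it would not have caught this filter-stage error. A sketch, assuming the default data path:

  # logstash.yml -- enable the dead letter queue (per pipeline)
  dead_letter_queue.enable: true
  # optional: where DLQ segments are stored (default: path.data/dead_letter_queue)
  path.dead_letter_queue: "/usr/share/logstash/data/dead_letter_queue"

Entries written there can later be replayed with the dead_letter_queue input plugin.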


In my opinion, if you can't guarantee that the field value you are splitting is an array, you should test the event in a conditional before the split filter.
If the field normally holds a non-empty array, then [field][0] returns a "truthy" value; if not, it returns nil, which is "falsey".

  if [field][0] {
    split { field => "[field]" }
  }

Thanks Guy. I agree some defensive programming is in order to guard against this but unfortunately bad data sometimes gets missed. I'm hoping there's a mechanism for dealing w/ exceptions in a pipeline that I'm just not seeing? Doesn't seem like an event exception should crash logstash and the VM it's in.

Yes, I agree that exceptions should not halt LS.

A very hard problem to solve, mostly for plugin authors, is deciding whether an exception is soft or hard and what the correct response should be.

We are adding Dead Letter Queue support to plugins, but each plugin has to be individually coded to decide whether an error should stop LS (a bug that causes every event to fail) or is a transient, event-related one whose event should be written to the DLQ.

Also, we are still deciding how you, the user, should come to know that events are piling up in the DLQ, how to inspect them, and how to go about building a fix for the DLQ entries. Eventually, with consistent DLQ support across all plugins, the DLQ will hold entries with different kinds of faults, so the UX for split, repair, and ingest is a hard one to crack.

At the moment, we don't have enough metadata on where the event came from, which config it travelled through, etc.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.