Logstash stalling during auto-reconfigure even though inflight_events = 0

Hello,

I have logstash so that it will autoreconfigure. When it reconfigures, it works fine sometimes, however, sometimes it will randomly give the message

[ERROR][logstash.shutdownwatcher ] The shutdown process appears to be stalled due to busy or blocked plugins. Check the logs for more information.

And then I can see that the restart is stalling with 0 inflight_count:
[2018-03-24T03:54:15,150][WARN ][logstash.shutdownwatcher ] {"inflight_count"=>0, "stalling_thread_info"=>{["LogStash::Filters::Mutate", {"rename"=>{"a"=>"%{b}"}, "id"=>"f0bc5038d58971608846f9c9deca78509541b842fcd91bde2f37e2aa531b8acd"}]=>[{"thread_id"=>54, "name"=>nil, "current_call"=>"[...]/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:133:ininitialize'"}, {"thread_id"=>55, "name"=>nil, "current_call"=>"[...]/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:133:in initialize'"}]}} [2018-03-24T03:54:20,149][WARN ][logstash.shutdownwatcher ] {"inflight_count"=>0, "stalling_thread_info"=>{["LogStash::Filters::Mutate", {"rename"=>{"a"=>"%{b}"}, "id"=>"f0bc5038d58971608846f9c9deca78509541b842fcd91bde2f37e2aa531b8acd"}]=>[{"thread_id"=>54, "name"=>nil, "current_call"=>"[...]/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:133:ininitialize'"}, {"thread_id"=>55, "name"=>nil, "current_call"=>"[...]/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:133:in initialize'"}]}} [2018-03-24T03:54:25,150][WARN ][logstash.shutdownwatcher ] {"inflight_count"=>0, "stalling_thread_info"=>{["LogStash::Filters::Mutate", {"rename"=>{"a"=>"%{b}"}, "id"=>"f0bc5038d58971608846f9c9deca78509541b842fcd91bde2f37e2aa531b8acd"}]=>[{"thread_id"=>54, "name"=>nil, "current_call"=>"[...]/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:133:ininitialize'"}, {"thread_id"=>55, "name"=>nil, "current_call"=>"[...]/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:133:in initialize'"}]}} [2018-03-24T03:54:30,150][WARN ][logstash.shutdownwatcher ] {"inflight_count"=>0, "stalling_thread_info"=>{["LogStash::Filters::Mutate", {"rename"=>{"a"=>"%{b}"}, "id"=>"f0bc5038d58971608846f9c9deca78509541b842fcd91bde2f37e2aa531b8acd"}]=>[{"thread_id"=>54, "name"=>nil, "current_call"=>"[...]/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:133:ininitialize'"}, {"thread_id"=>55, "name"=>nil, "current_call"=>"[...]/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:133:in initialize'"}]}}

It looks like it is the last line in my filter that it is stalling at and it seems to do this regardless of what the content of the last line actually is.

I'd appreciate some help, thanks!

We recently found a race condition in how pipelines are shutdown, and have a fix in a pull request as of earlier today.

It should make it into 6.3.0 in the next couple weeks.

Thanks for the info.

I'm running Logstash with those changes in right now and it seems to no longer stall, but, I get another error occasionally during reconfiguration:

[2018-03-24T07:50:01,774][FATAL][logstash.runner ] An unexpected error occurred! {:error=>#<NoMethodError: undefined method map' for nil:NilClass Did you mean? tap>, :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/util.rb:40:in thread_info'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:668:in block in plugin_threads_info'", "org/jruby/RubyArray.java:2486:in map'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:668:in plugin_threads_info'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:672:in stalling_threads_info'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_reporter.rb:67:in block in to_hash'", "/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:80:in inflight_batches'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_reporter.rb:56:in to_hash'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_reporter.rb:51:in snapshot'", "/usr/share/logstash/logstash-core/lib/logstash/shutdown_watcher.rb:88:in pipeline_report_snapshot'", "/usr/share/logstash/logstash-core/lib/logstash/shutdown_watcher.rb:63:in block in start'", "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/stud-0.0.23/lib/stud/interval.rb:20:in interval'", "/usr/share/logstash/logstash-core/lib/logstash/shutdown_watcher.rb:59:in start'", "/usr/share/logstash/logstash-core/lib/logstash/shutdown_watcher.rb:35:in `block in start'"]}

Looks like it's caused by backtrace = thread.backtrace.map do |line| in thread_info of util.rb?

Is there a nil check missing for thread.backtrace?

d'oh! I found that one too and had a PR sitting unmerged until just now:

Looks like it's been a productive day for the Elastic team!

Thanks a lot!

Is pipeline.workers=1 a workaround for this?

probably?

I've not done the mental gymnastics to validate, but on the surface I think this would likely eliminate the race condition.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.