I was able to reproduce this issue by running logstash locally, and I tracked it down to a recent change we made to attempt to have logstash reload its configuration.
Using this simple pipeline config:
input {
stdin {}
}
output {
cloudwatchlogs {
log_group_name => "hopcloud-operational-logs"
log_stream_name => "%instance_id%"
region => "us-west-2"
}
}
I expect this to fail on my desktop because there's no EC2 metadata service on my desktop, obviously. And it does, with a very similar error (although it times out instead of getting a connection refused):
$ docker run -ti --rm -v "$(pwd)/pipeline.conf:/usr/share/logstash/pipeline/logstash.conf" our-repo/logstash:6.3.1-4 --log.level=warn
Sending Logstash's logs to /usr/share/logstash/logs which is now configured via log4j2.properties
[2021-05-03T21:17:56,400][ERROR][logstash.pipeline ] Error registering plugin {:pipeline_id=>"main", :plugin=>"#<LogStash::OutputDelegator:0x5739af46>", :error=>"Failed to open TCP connection to 169.254.169.254:80 (execution expired)", :thread=>"#<Thread:0x52f6091a run>"}
[2021-05-03T21:17:56,408][ERROR][logstash.pipeline ] Pipeline aborted due to error {:pipeline_id=>"main", :exception=>#<Net::OpenTimeout: Failed to open TCP connection to 169.254.169.254:80 (execution expired)>, :backtrace=>["org/jruby/ext/socket/RubyTCPSocket.java:112:in `initialize'", "org/jruby/RubyIO.java:1154:in `open'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:885:in `block in connect'", "org/jruby/ext/timeout/Timeout.java:149:in `timeout'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:883:in `connect'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:868:in `do_start'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:857:in `start'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:585:in `start'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:480:in `get_response'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:457:in `get'", "/app/logstash-output-cloudwatchlogs/lib/logstash/outputs/cloudwatchlogs.rb:116:in `register'", "org/logstash/config/ir/compiler/OutputStrategyExt.java:97:in `register'", "org/logstash/config/ir/compiler/OutputDelegatorExt.java:93:in `register'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:340:in `register_plugin'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:351:in `block in register_plugins'", "org/jruby/RubyArray.java:1734:in `each'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:351:in `register_plugins'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:728:in `maybe_setup_out_plugins'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:361:in `start_workers'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:288:in `run'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:248:in `block in start'"], :thread=>"#<Thread:0x52f6091a run>"}
[2021-05-03T21:17:56,425][ERROR][logstash.agent ] Failed to execute action {:id=>:main, :action_type=>LogStash::ConvergeResult::FailedAction, :message=>"Could not execute action: PipelineAction::Create<main>, action_result: false", :backtrace=>nil}
But then, it dies! (With exit code 0, for some reason.) Which is what I want.
But in production, we recently added the flags --config.reload.automatic --config.reload.interval=30s
. When I add those to my local copy, I get the same output:
$ docker run -ti --rm -v "$(pwd)/pipeline.conf:/usr/share/logstash/pipeline/logstash.conf" our-repo/logstash:6.3.1-4 --log.level warn --config.reload.automatic --config.reload.interval=30s
Sending Logstash's logs to /usr/share/logstash/logs which is now configured via log4j2.properties
[2021-05-03T21:35:37,956][ERROR][logstash.pipeline ] Error registering plugin {:pipeline_id=>"main", :plugin=>"#<LogStash::OutputDelegator:0x1bfcd999>", :error=>"Failed to open TCP connection to 169.254.169.254:80 (execution expired)", :thread=>"#<Thread:0xd75a9be run>"}
[2021-05-03T21:35:37,965][ERROR][logstash.pipeline ] Pipeline aborted due to error {:pipeline_id=>"main", :exception=>#<Net::OpenTimeout: Failed to open TCP connection to 169.254.169.254:80 (execution expired)>, :backtrace=>["org/jruby/ext/socket/RubyTCPSocket.java:112:in `initialize'", "org/jruby/RubyIO.java:1154:in `open'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:885:in `block in connect'", "org/jruby/ext/timeout/Timeout.java:149:in `timeout'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:883:in `connect'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:868:in `do_start'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:857:in `start'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:585:in `start'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:480:in `get_response'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:457:in `get'", "/app/logstash-output-cloudwatchlogs/lib/logstash/outputs/cloudwatchlogs.rb:116:in `register'", "org/logstash/config/ir/compiler/OutputStrategyExt.java:97:in `register'", "org/logstash/config/ir/compiler/OutputDelegatorExt.java:93:in `register'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:340:in `register_plugin'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:351:in `block in register_plugins'", "org/jruby/RubyArray.java:1734:in `each'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:351:in `register_plugins'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:728:in `maybe_setup_out_plugins'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:361:in `start_workers'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:288:in `run'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:248:in `block in start'"], :thread=>"#<Thread:0xd75a9be run>"}
[2021-05-03T21:35:37,988][ERROR][logstash.agent ] Failed to execute action {:id=>:main, :action_type=>LogStash::ConvergeResult::FailedAction, :message=>"Could not execute action: PipelineAction::Create<main>, action_result: false", :backtrace=>nil}
^C[2021-05-03T21:42:56,591][WARN ][logstash.runner ] SIGINT received. Shutting down.
But as you can see, it hangs until I kill it with Ctrl+C.
This likely means that this isn't a new problem, just one that we were able to correctly retry in the past. So I'll pull these flags back out and we'll go back to the old, adequate behavior.